Abstract

This  research  develops  and  evaluates  a  novel  computer  system  for  the  detection  of 
microcalcifications  in  mammograms  using  image  texture  analysis.  The  system  can  provide 
a second opinion to radiologists to decrease the number of false readings, which include
diagnosing a mammogram as containing no calcifications when some are present (false negative)
or as containing microcalcifications when none are present (false positive). The system follows
a  Model  Based  Vision  (MBV)  paradigm  for  automatic  detection  of  calcifications.  The 
Focus  of  Attention  Module  utilizes  an  image  difference  technique  followed  by  global  and 
local  thresholding  to  eliminate  nearly  90%  of  the  image  from  further  processing.  A  new, 
unique  feature,  the  Laws  Energy  Ratio,  is  presented.  The  Laws  Energy  Ratios  from  the 
L5R5  and  L5E5  Laws  masks  provide  Indexing  criteria  which  correctly  hypothesized  93% 
of  the  microcalcification  regions  while  reducing  the  number  of  false  regions  by  over  75%. 
A  comparative  study  of  three  different  texture  measures  using  features  calculated  from 
Angular  Second  Moment,  Laws  Energy  Ratios  and  Power  Spectrum  Analysis  is  presented. 
Using  a  neural  network  trained  with  a  modified  backpropagation  algorithm,  the  Power 
Spectrum  Analysis  feature  set  had  the  best  overall  performance  with  an  83%  Probability 
of  Detection  and  an  average  False  ROI  Rate  of  2.17  ROIs  per  image  over  53  mammograms. 
A  combination  of  Laws  Energy  Ratio  and  Power  Spectrum  Analysis  features  selected  using 
Ruck  Saliency  metrics  achieved  an  85%  Probability  of  Detection  with  an  average  4  false 
ROIs  per  image.  Although  not  specifically  developed  for  classifying  regions  as  malignant 
or  benign,  the  system  correctly  identified  89%  of  the  malignant  microcalcification  regions. 

Computer Aided Detection of Microcalcifications Utilizing Texture Analysis


I.  Introduction 

Detection  of  breast  cancer  is  a  difficult  and,  as  of  yet,  unsolved  problem.  Advances  in 
digital  image  processing  techniques  may  lead  to  improvements  in  detection  and  diagnosis  of 
this  disease.  The  Air  Force  Institute  of  Technology  (AFIT)  has  a  long  history  of  applying 
computer  vision  and  image  processing  to  a  host  of  military  related  problems[35,  20,  12,  15, 
33].  It  is  the  goal  of  this  research  to  extend  this  work  into  the  area  of  medical  imaging[17, 
25,  9,  13]. 

1.1  Breast  Cancer  Information 

Breast cancer is a leading cause of cancer deaths among women, currently exceeded
only by lung cancer, and will eventually affect one in nine women in the United States[36, 2].
In  1994  alone,  the  National  Cancer  Institute  (NCI)  estimated  that  182,000  women  would  be 
newly  diagnosed  with  breast  cancer,  with  approximately  46,000  deaths  from  the  disease[3]. 
The  outlook  for  the  next  several  years  does  not  appear  any  brighter.  The  number  of  newly 
diagnosed  cases  is  expected  to  hold  steady  at  approximately  150,000  each  year [9]. 

Mammography is currently the best method for the detection of breast cancer. However,
10-30% of women who have breast cancer had mammograms that were read as negative.
In two-thirds of these mammograms, the cancer missed by the radiologist was evident
retrospectively[13]. The missed detection may be attributed to a number of factors: the
subtle nature of the cancer, poor image quality, eye fatigue or merely oversight by the
radiologist. It has been suggested that having the mammograms read by two radiologists may
improve detection[22]. This would, however, increase the already high workload on
the radiologists, possibly leading to more missed cancer regions. Computer aided diagnosis
may be a solution, providing the radiologist with a “second opinion” or
a “second reading” by indicating locations of suspect abnormalities in the mammograms.

1.2 Computer Aided Diagnosis

Computer  aided  diagnosis,  or  CADx,  is  an  automated  tool  that  is  based  on  digital 
image  processing  for  the  detection  and  classification  of  breast  cancer.  The  mammographic 
film  can  be  digitized  to  allow  for  the  computer  processing  of  the  image.  The  CADx  system 
will  consist  of  basically  four  main  parts: 

1.  The  system  would  first  identify  possible  cancerous  areas,  or  regions  of  interest,  in  the 
mammogram.  This  is  referred  to  as  Focus  of  Attention. 

2.  An  initial  hypothesis  is  made  as  to  the  classification  of  the  region  of  interest.  This 
step  is  referred  to  as  Indexing. 

3. The indexed regions are then passed to a set of algorithms to extract the features
required to verify the initial hypothesis from the indexer. These features should
capture the diagnostically critical content of the image and are passed on to the final
stage of matching.

4. A classifier will attempt to match the extracted features against predicted features to
identify the segmented region as normal/abnormal tissue or cancerous/benign tissue.

The  CADx  system  is  not  being  developed  to  replace  the  radiologist  but  to  assist  them. 
The  primary  objective  of  the  system  is  to  improve  detection  of  breast  cancer  in  hopes  of 
increasing  the  effectiveness  and  efficiency  of  mammographic  screening  [13].  The  addition 
of  classifying  the  suspected  regions  as  cancerous  or  benign  may  reduce  the  number  of 
false-positive  diagnoses,  thereby  decreasing  patient  morbidity  and  the  number  of  surgical 
biopsies  performed.  The  CADx  system  has  the  potential  to  save  lives  while  reducing 
unnecessary  biopsy  and  surgery. 

1.3  Problem  Statement 

Develop a CADx system to detect microcalcifications in a mammogram using an
image differencing technique with a global and local thresholding scheme for focus of
attention, create an initial indexing hypothesis from cluster and texture analysis information,
extract features based on the texture analysis of the region of interest, and finally match
the extracted features using artificial neural networks.

1.4 Scope

Computer algorithms will be developed for the detection and classification of
microcalcifications. Microcalcifications are generally the most difficult sign of breast cancer to
detect  as  compared  to  other  signs  such  as  masses  or  tumors.  Microcalcifications  are  also 
one  of  the  first  mammographically  detectable  manifestations  of  cancer. 

The Focus Of Attention (FOA) algorithms will be based on image differencing
techniques. Work by Chan, et al.[8] has demonstrated the potential of this technique. Their
technique  will  be  augmented  by  preprocessing  the  image  to  increase  the  dynamic  range  of 
the  pixel  values  where  most  of  the  microcalcification  information  is  found.  The  goal  of  the 
FOA  stage  will  be  to  retain  at  least  90%  of  the  known  cancerous  regions  while  reducing 
the  total  number  of  pixels  to  be  further  examined  by  at  least  80%. 

Indexing  will  be  accomplished  by  thresholding  the  FOA  regions  of  interest  (ROIs) 
based  on  texture  energy  ratios  and  the  number  of  identified  microcalcifications  in  the  ROI. 
Regions passed by the Indexing stage will be assumed to possibly contain
microcalcifications. Once this initial hypothesis is generated, a set of features will be extracted from the
regions of interest to be matched against predicted features. The predicted features will
be  developed  from  training  data  used  during  initial  development  of  the  system. 

The  features  to  be  extracted  will  be  a  function  of  second  order  histogram  statistics 
and  image  texture  analysis.  The  second  order  histogram  features  were  based  upon  previous 
breast  cancer  research[17,  9].  The  image  texture  analysis  will  be  based  on  the  use  of  the 
Laws  Texture  measures[30]  and  Power  Spectrum  Analysis[41]. 

The  extracted  and  predicted  feature  sets  will  be  matched  using  neural  networks. 
The  LNKnet  software  available  here  at  AFIT  will  be  used.  A  number  of  classification 
techniques are available in LNKnet including K nearest neighbor, Gaussian and
Multi-Layer Perceptron (MLP) neural networks[19]. A neural network will also be developed to
evaluate the effects of training on an imbalanced training feature set, that is, a set where one
class has a much larger number of samples available than the other.


1.5  Overview 

Chapter  I  presented  the  basis  for  applying  computer  vision  techniques  to  solving 
the  breast  cancer  detection  problem.  Chapter  II  provides  background  information  on 
breast  cancer,  computer  vision  and  related  breast  cancer  research.  Chapter  III  provides 
methodology  of  the  specific  techniques  used  in  this  research.  Details  on  the  database 
of  mammograms  used  and  analysis  of  the  research  are  presented  in  Chapter  IV.  Final 
results  and  conclusions  pertaining  to  this  research  are  given  in  Chapter  V.  Additional 
database  information  and  computer  code  developed  during  this  research  are  provided  in 
the  appendices. 

II. Background


2.1 Breast Cancer

The sign of breast cancer focused on in this research can be identified in a
mammogram by small worm-like deposits of calcium, called microcalcifications. It is important
to note that calcifications are a normal occurrence in breast tissue. These are referred to
as benign calcifications. A radiologist will make a preliminary diagnosis from a
mammogram as to the type of calcification using criteria similar to those in Table 2.1[16]. Most
calcifications will have characteristics from both the benign and malignant criteria and the
radiologist will have to determine the importance of each feature to classify the lesion as
more likely to be malignant or benign.


Criteria     BENIGN                              MALIGNANT
----------   ---------------------------------   -------------------------------
Size         > 0.5 mm in diameter                0.1-0.5 mm in diameter
Density      <5 in 1 ml vol                      >5 in 1 ml vol
Appearance   Regular, smooth shape               Irregular shape, pointed edges
             Large and thick                     Small and thin
             Diffusely scattered, both breasts   Local concentration, one breast

Table 2.1 Criteria for Diagnosis of Microcalcifications[16]


A  radiologist  may  also  consider  any  risk  factors  that  are  associated  with  the  patient 
while  making  a  diagnosis.  Age,  family  history  and  social  status  are  factors  that  may  be 
indicators  of  patients  more  likely  to  have  malignant  lesions.  However,  these  indicators 
need  to  be  used  with  care,  as  the  American  Cancer  Society  estimates  that  75%  of  breast 
cancers occur in women with no high risk factors[1]. Table 2.2 contains an excerpt from a
list  of  common  risk  factors  as  compiled  by  Tanne[40]. 

Once  a  suspicious  region  is  detected,  a  biopsy  is  normally  performed  to  determine 
whether  the  lesion  is  malignant  or  benign.  The  biopsy  sample  is  forwarded  to  a  pathologist 
to  make  gross  (visible  to  the  naked  eye)  and  microscopic  examinations  of  the  sample. 
Appendix  A  contains  a  breakdown  of  the  number  of  malignant  and  benign  cases  used  in 
this  study. 

Risk Level                  Risk Factor                     Criteria
-------------------------   -----------------------------   ----------------------------------------------
Significantly higher risk   Age                             50 or older
                            Country of birth                North America, northern Europe
                            Family medical history          Mother or sister with history of breast cancer
                            Socioeconomic status            Upper class
                            Age at first pregnancy          30 or older
Moderately higher risk      Personal medical history        Previous cancer in one breast;
                                                            benign tumor (fibroadenoma)
                            Family medical history          Mother or sister with history of breast cancer
                            Marital status                  Never married
                            Place of residence              Urban; Northern United States
                            Race                            Caucasian women 45 or older;
                                                            African-American women younger than age 40
Slightly higher risk        Duration of estrogen exposure   Menopause after age 55;
                                                            menstruation before age 11
                            Number of pregnancies           None
                            Weight                          Obesity after menopause
                            Personal medical history        Previous endometrial or ovarian cancer

Table 2.2 Risk Factors for Breast Cancer in Women[40]

It is hoped that computer-aided diagnosis can assist a radiologist in detecting
suspicious regions in a mammogram and possibly provide a diagnosis of the region based on
digital image processing techniques. A promising methodology being developed for
automatic target recognition is Model Based Vision (MBV)[4]. This type of architecture will
be used for developing the CADx system for this thesis.

2.2  Computer-Aided  Diagnosis:  Model  Based  Vision 

The  Model  Based  Vision  architecture  is  based  on  developing  hypotheses  and  testing 
them  to  detect  and  identify  objects  of  interest  in  an  image.  The  MBV  approach  utilizes 
models  of  sensors,  targets  and  background  to  better  predict  the  characteristics  of  potential 
targets  that  can  be  determined  by  digital  image  processing.  The  following  provides  a  brief 
summary  of  the  stages  in  an  MBV  system  and  related  research  in  those  stages. 

2.2.1 Focus of Attention. The first level of an MBV system is referred to as
Focus of Attention (FOA). This stage is often referred to as segmentation. The purpose
of this stage is to eliminate as much as possible of the image that obviously does not
contain something of interest. For this research, the output of this stage consists of regions
where microcalcifications may be present. These regions are referred to as Regions of
Interest (ROIs). The goal of this stage is to pass all regions containing microcalcifications,
or true positives, and as few regions as possible that contain normal tissue, or false ROIs.

A segmentation technique based on image differencing was developed by Chan and
Nishikawa[27, 8, 7, 26]. The process is based on filtering the image twice: once to increase
the signal to noise ratio (SNR) of the microcalcifications as compared to normal tissue, and
a second time to decrease the SNR of the microcalcifications. The two filtered images are differenced
and then globally thresholded to retain only the pixels with values at the high end of the
gray-level histogram. These pixels are then subjected to local thresholding, which retains only
pixels with gray levels in the original image that are greater than the mean plus 3.4 times
the standard deviation of the surrounding 51 by 51 pixel window. Finally, morphological
erosion and a clustering algorithm are applied to reduce the number of false signals. This
technique yielded 85% probability of detection with 2 false regions per image when applied
to a set of 78 mammograms.
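
The local thresholding rule described above can be illustrated with a short sketch. This is not the code of Chan and Nishikawa; it is a minimal standard-library Python illustration that assumes the image is a list of rows of gray levels, that the window is clipped at the image border, and that the population standard deviation is used (none of which is stated in the text). The function name `local_threshold` is hypothetical.

```python
from statistics import mean, pstdev

def local_threshold(image, k=3.4, window=51):
    """Keep pixels brighter than local mean + k * local std over a window x window box."""
    rows, cols = len(image), len(image[0])
    half = window // 2
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Assumption: the box is clipped at the image border.
            box = [image[rr][cc]
                   for rr in range(max(0, r - half), min(rows, r + half + 1))
                   for cc in range(max(0, c - half), min(cols, c + half + 1))]
            mask[r][c] = image[r][c] > mean(box) + k * pstdev(box)
    return mask

# Toy 9x9 image: flat background with one bright spot; a 9x9 window keeps the demo fast.
toy = [[100] * 9 for _ in range(9)]
toy[4][4] = 400
spots = local_threshold(toy, k=3.4, window=9)
```

On the toy image only the bright center pixel survives. A full-size mammogram would call for an array library and running sums rather than this direct loop.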

While  the  technique  developed  by  Chan  and  Nishikawa  is  dependent  on  local  contrast, 
Brettle,  et  al.  created  a  segmentation  scheme  that  operates  in  the  frequency  domain[6]. 
Operating  in  the  frequency  domain  allows  selected  frequency  components  to  be  modified 
independently of spatial contrast. The original image is converted into its frequency
components by use of the Fourier Transform. The technique then utilizes a combination of
a Butterworth high pass filter and a matched filter tuned to detect structures resembling
microcalcifications. The resulting image is spatially filtered to remove noise and globally
thresholded to retain only pixels above some multiple of the standard deviation in the
image. Brettle applied this technique to 15 segmented regions and achieved 100% probability
of  detection  with  a  false  positive  rate  of  4  calcifications  per  region.  It  should  be  noted  that 
this  technique  was  not  applied  to  an  entire  image,  only  a  small  portion  of  a  full  image. 
This  research  will  be  processing  the  entire  breast  image. 
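
For reference, the gain of a Butterworth high pass filter at distance D from the zero frequency is commonly written as H(D) = 1 / (1 + (D0/D)^(2n)), where D0 is the cutoff and n the order. The sketch below builds such a gain grid in plain Python. The cutoff and order used by Brettle are not given in the text, so the values here are placeholders, and the matched filter stage is not shown. The function name `butterworth_highpass` is hypothetical.

```python
import math

def butterworth_highpass(rows, cols, cutoff, order=2):
    """Return a rows x cols gain grid with the zero frequency at index [0][0]."""
    gains = []
    for u in range(rows):
        fu = min(u, rows - u)      # magnitude of the wrapped frequency index
        row = []
        for v in range(cols):
            fv = min(v, cols - v)
            d = math.hypot(fu, fv)
            # Gain is 0 at the zero frequency and approaches 1 at high frequencies.
            row.append(0.0 if d == 0 else 1.0 / (1.0 + (cutoff / d) ** (2 * order)))
        gains.append(row)
    return gains

H = butterworth_highpass(8, 8, cutoff=2.0, order=2)
```

Multiplying the Fourier transform of an image by this grid element by element suppresses the slowly varying background while passing small, sharp structures.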

Yoshida,  et  al.  implemented  a  set  of  Least  Asymmetric  Daubechies  (LAD)  wavelets 
for  the  automated  detection  of  clustered  microcalcifications  [42].  Their  preliminary  results 
using  a  database  of  39  mammograms  with  41  microcalcification  clusters  yielded  a  detection 
rate  of  85%,  with  a  false  positive  rate  of  5  clusters  per  image. 

2.2.2  Indexing.  The  indexing  module  creates  an  initial  hypothesis  space  which 
attempts  to  assign  some  identification  to  a  region  of  interest  in  an  image.  This  is  an 
overall  likelihood  or  confidence  measure  for  later  model-based  refinement.  Traditional 
target  recognition  schemes  do  not  include  this  stage,  opting  to  go  directly  to  the  next 
process  termed  feature  extraction. 

2.2.3  Feature  Extraction.  The  Feature  Extraction  phase  attempts  to  provide 
compact,  quantitative  descriptions  of  image  characteristics.  The  extracted  features  are 
matched  against  predicted  features  to  recognize  targets.  There  are  a  number  of  desirable 
properties for extracted features[11, 4]:

1. Robust: Reliably found in imagery and stable with respect to small image changes,
such  as  uncertainties  in  absolute  amplitude. 

2.  Discriminating:  Responsive  to  differences  among  targets.  A  trade-off  exists  between 
robustness  and  discriminating  power.  A  system  may  attempt  to  classify  a  region 
beginning with robust, less discriminating features then use less robust, highly
discriminating features to establish fine distinctions.

3.  Extractable:  Computable  from  image  data. 

4.  Predictable:  Derivable  from  3-D  models  and/or  a  priori  exemplars. 

5.  Efficient:  Low  computational  load  and  a  minimum  set  of  required  features. 

The University of Chicago has obtained encouraging results using features derived
from the first moment of the power spectrum of the region[13]. Chitre, et al. and Kocur
have made use of features derived from the second order histogram of the region,
including Entropy, Contrast, Angular Second Moment, and Inverse Difference Moments[9, 18].
In further work, Chitre included a set of binary cluster features (number of calcifications,
average distance between calcifications, etc.) in addition to the second order histogram
features, which improved the classification of malignant vs. benign regions[10]. A combination
of shape, texture and contrast features was applied to images containing
microcalcifications by Parker, et al.[28]. Texture features have also been used to discriminate between
glandular and fatty regions in a study by Astley and Miller[23]. In their study, the images
were filtered with the Laws Texture masks[30] and image statistics were used to classify
the breast tissue. The masks found to be most useful in discriminating between glandular
and fatty regions were the 5x5 versions of the edge and spot filters (R5R5, L5L5 and S5R5).

In  research  accomplished  here  at  AFIT,  feature  extraction  techniques  have  focused 
on  three  main  areas:  second-order  histograms,  Karhunen-Loeve  transforms  and  wavelet 
transforms [17,  18].  Originally  developed  and  evaluated  for  military  and  face  recognition 
applications,  these  techniques  were  applied  to  breast  cancer  detection[25].  The  Angular 
Second Moment (ASM) was generated from the co-occurrence matrix, or second order
histogram. In this study, only a single distance vector was used in determining the ASM
calculation for the image. The Karhunen-Loeve transform, also referred to as principal
component analysis, attempted to determine the directions of maximum variance in a
given feature set. Actual pixel values from malignant and benign regions of interest were
used as the feature set. The final area of research applied to breast cancer was wavelet
decomposition. Daubechies and biorthogonal wavelet decompositions were applied to the
microcalcification regions. The best results were achieved using a biorthogonal wavelet
decomposition, obtaining an 88% correct classification rate on 93 difficult to diagnose
images[17].
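
To make the Angular Second Moment concrete, the following sketch computes it for a single distance vector as the sum of squared co-occurrence probabilities. It is an illustration only: it assumes a non-symmetric co-occurrence count and an image given as a list of rows of quantized gray levels, and the function name `angular_second_moment` is hypothetical.

```python
from collections import Counter

def angular_second_moment(image, dx=1, dy=0):
    """ASM = sum of squared co-occurrence probabilities for displacement (dx, dy)."""
    rows, cols = len(image), len(image[0])
    pairs = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Count the gray-level pair (this pixel, displaced pixel).
                pairs[(image[r][c], image[r2][c2])] += 1
    total = sum(pairs.values())
    return sum((n / total) ** 2 for n in pairs.values())

flat = [[7] * 4 for _ in range(4)]                            # perfectly uniform texture
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]  # alternating texture
asm_flat = angular_second_moment(flat)
asm_checker = angular_second_moment(checker)
```

A uniform region concentrates all pairs in one cell and gives an ASM of 1, while textures that spread the pairs over more cells give smaller values.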

2.2.4 Prediction. The Prediction stage focuses on producing quantitatively
correct signature features suitable for matching. This stage may include producing a “model”
of a region of interest based on information gained from the Focus of Attention and
Indexing stages. This model will attempt to simulate a target in the appropriate background
based on image information and will have the same features extracted as the candidate
region of interest for use in the matching phase. For this research, the prediction module will
not develop models, but will reference training data regions of interest that are consistent
with the indexing hypothesis.

2.2.5  Matching.  Once  a  region  has  been  processed  by  the  FOA,  been  assigned 
an  initial  hypothesis,  and  the  desired  features  are  extracted  from  the  regions  of  interest, 
the  features  are  sent  to  a  classification  algorithm  in  an  attempt  to  verify,  or  match,  the 
predicted hypothesis. A number of classification schemes have been developed for pattern
recognition[11]. Currently, one of the most novel classification schemes for medical
imaging is the multilayer perceptron (MLP) artificial neural network[13, 9]. Neural networks
have a number of benefits when applied to cancer detection and diagnosis[32]. A neural
network, as well as other classifier types, can be evaluated with LNKnet, a versatile
classification program[19]. LNKnet is capable of evaluating a given feature set using a number
of  classifiers,  including  a  statistical  (Gaussian)  or  a  non-parametric  (K-Nearest  Neighbor) 
classifier. 

2.2.6 Search. The Search module evaluates the results of the Prediction, Feature
Extraction, and Matching process to determine whether or not an acceptable match was
achieved. For this research, the search module will examine only the output of the match process.
The input mammogram images will either contain microcalcifications or will consist of all
normal tissue. Therefore, a region identified as containing microcalcifications will be deemed an
acceptable match.


2.3  Feature  Selection 


In any pattern recognition problem, it is desirable to classify a pattern using as
few features as possible[11]. A reduced feature space leads to lower computational
requirements and better generalization to unseen data. A number of techniques are
available to attempt to determine which of the features contain the most relevant classification
information.


A simple, statistical measure to quantify how separable a feature is in a two class
problem is the Fisher Ratio, Eqn 2.1, where $\mu_i$ and $\sigma_i^2$ are the mean and variance of
the feature set for class i[29]. The Fisher Ratio is a measure of the separability of the
Probability Density Function (PDF) of the feature for each class. The larger the Fisher
Ratio, the more separable the classes are for that particular feature. This test is useful for
only a single feature and does not give any insight into the effects of combinations
of features. Still, it can be used for an initial determination of the potential classification
ability for a feature, such as a particular distance vector used to generate an Angular
Second Moment value.

$$F = \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2} \tag{2.1}$$
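
As a small worked example of the Fisher Ratio in its usual two-class form (squared difference of the class means divided by the sum of the class variances), the sketch below scores one feature from two lists of sample values. The use of the population variance and the function name `fisher_ratio` are assumptions made for illustration.

```python
from statistics import mean, pvariance

def fisher_ratio(class1, class2):
    """Squared difference of class means divided by the sum of class variances."""
    m1, m2 = mean(class1), mean(class2)
    v1, v2 = pvariance(class1), pvariance(class2)
    return (m1 - m2) ** 2 / (v1 + v2)

# Hypothetical feature values for two classes of regions.
well_separated = fisher_ratio([1.0, 2.0, 3.0], [7.0, 8.0, 9.0])
overlapping = fisher_ratio([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

The first pair of classes has means 2 and 8 with variance 2/3 each, giving a ratio of 27; the overlapping pair gives 0.75, reflecting its poorer separability.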


A technique has been developed by Steppe that integrates feature and neural network
architecture selection[38, 39]. The Steppe algorithm uses an iterative likelihood ratio
test statistic as a model selection criterion for sequentially determining the “best” neural
network.


The Steppe approach is a combination of a statistical model building perspective and
backwards sequential selection. The process begins with architecture selection, where I
versions of a neural network with N hidden nodes and M features are trained and tested.
Then, the same number of neural networks are trained and tested with N - 1 hidden nodes.
If the N - 1 hidden node network results are not statistically significantly different from
the N hidden node network results, the reduced network is retained. Next, feature selection is
accomplished, where I versions of the current network architecture are trained and tested
with M features. This is followed by I networks trained and tested with one of the M
features removed. This is repeated until each feature has been left out in turn. The feature whose
removal causes the least statistically significant change in results is eliminated and the process
of architecture selection is begun again[39]. This process can be implemented to find the
smallest architecture and the single “best” feature or feature subset for a given classification
problem[17].

One  of  the  key  practical  considerations  is  the  necessary  computing  time  and  resources 
for  performing  architecture  and  feature  selection  on  a  given  data  set.  For  large  data  sets 
with  a  number  of  features,  the  training  of  multiple  neural  networks  for  each  architecture 
and  feature  set  requires  extensive  processing  time. 

Another  method  designed  specifically  for  neural  networks  is  a  derivative  based  saliency 
metric developed by Ruck[34]. This saliency metric determines which features affect the
output  of  a  trained  neural  network  by  taking  the  derivative  of  the  output  with  respect  to 
each  input  feature.  The  features  having  the  most  effect  on  the  output  will  have  a  higher 
value.  This  is  done  by  training  multiple  neural  networks  and  averaging  the  saliency  value 
for  each  feature.  The  Ruck  method  is  much  faster  and  easier  to  implement  in  comparison 
to  the  Steppe  algorithm. 
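
The idea behind a derivative based saliency can be sketched as follows. This is a simplified illustration, not Ruck's published formulation: it assumes a single hidden layer, sigmoid activations and one output, evaluates the derivative only at the supplied sample points, and averages the absolute derivative over samples and networks. The weights shown are hypothetical stand-ins for trained networks.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def output_gradient(x, w1, b1, w2, b2):
    """Return dz/dx_i for a one-hidden-layer, single-output sigmoid network."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    z = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    # Chain rule through the output sigmoid and each hidden sigmoid.
    return [z * (1 - z) * sum(w2[j] * hidden[j] * (1 - hidden[j]) * w1[j][i]
                              for j in range(len(hidden)))
            for i in range(len(x))]

def saliency(samples, networks):
    """Mean |dz/dx_i| over all samples and all networks."""
    totals = [0.0] * len(samples[0])
    for w1, b1, w2, b2 in networks:
        for x in samples:
            for i, g in enumerate(output_gradient(x, w1, b1, w2, b2)):
                totals[i] += abs(g)
    count = len(samples) * len(networks)
    return [t / count for t in totals]

# Hypothetical weights for two networks; both ignore the second feature.
net_a = ([[2.0, 0.0], [-1.0, 0.0]], [0.1, -0.2], [1.5, -0.5], 0.0)
net_b = ([[1.0, 0.0], [0.5, 0.0]], [0.0, 0.3], [1.0, 1.0], -0.1)
scores = saliency([[0.2, 0.9], [0.7, 0.1], [0.5, 0.5]], [net_a, net_b])
```

Because neither network connects the second feature to its hidden layer, that feature receives a saliency of zero, while the first feature receives a positive score and would be ranked as more useful.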

2.4  Summary 

Research  in  the  area  of  pattern  recognition  and  breast  cancer  is  extensive.  A  number 
of  candidate  techniques  have  been  developed  and  evaluated  yielding  promising  results.  Yet, 
no  single  system  or  technique  will  be  able  to  correctly  identify  microcalcification  regions 
in  every  case.  The  solution  may  exist  in  having  a  number  of  techniques  processing  an 
image  and  combining  the  results.  It  is  the  focus  of  on-going  research  at  AFIT  to  develop 
and  analyze  new  techniques  for  use  in  diagnosing  breast  cancer.  These  techniques  are 
being  designed  to  be  implemented  in  a  Model  Based  Vision  architecture.  The  processes 
specifically  developed  in  this  research  are  presented  and  expanded  in  the  next  chapter. 

III. Methodology


3.1 Introduction

This  chapter  describes  the  actual  techniques  used  to  discriminate  regions  containing 
microcalcifications  from  regions  of  normal  tissue. 

3.2  Database 

The  mammograms  used  in  this  research  were  obtained  from  the  Wright-Patterson 
Medical  Center,  Wright-Patterson  AFB.  A  total  of  72  patient  cases  were  selected  to  be 
digitized providing a total of 284 mammograms. The films were digitized to 0.1 mm by 0.1
mm pixel size with 12 bit gray scale resolution (4096 gray levels) using a Lumiscan 200
Laser Film Digitizer and Macintosh computer. The system was calibrated such that the
optical  density  range  of  0  to  3.5  was  digitized  linearly  to  0.001  optical  density  unit/pixel 
value.  After  digitizing,  each  mammogram  was  manually  sized  to  1024  x  2048  pixel  images 
for  evaluation  with  the  CADx  system. 

Each mammogram had a corresponding pathology report indicating the diagnosis
and location of suspected regions. Dr. Jeff Hoffmeister reviewed and annotated each
mammogram as to the location and type of abnormality, if any. Table 3.1 shows the various
types of tissue abnormalities and the corresponding number of images available in the
database. The total number of images in Table 3.1 exceeds the total number of
mammograms digitized as some images contained multiple abnormalities.


Abnormality                           Number of Images
-----------------------------------   ----------------
Biopsy Proven Malignant Microcalcs     39
Benign Microcalcs                      37
Biopsy Proven Malignant Masses         48
Benign Masses                          53
No Abnormality Visible                140
TOTAL                                 284

Table 3.1 Number of Images Available in Database for Various Tissue Abnormalities

3.3 System Overview

This  section  provides  a  brief  overview  of  the  Microcalcification  Detection  System.  A 
Flow  Diagram  is  shown  in  Figure  3.1.  This  system  follows  the  basic  Model  Based  Vision 
architecture.  The  first  module  of  the  system,  Focus  of  Attention,  attempts  to  reduce  the 
amount  of  data  to  be  processed  by  the  system.  The  original  image  is  first  preprocessed 
to  improve  the  contrast  and  dynamic  range  of  the  microcalcifications  in  the  image  by 
remapping  the  gray  levels  with  a  sigmoidal  function.  This  modified  image  is  then  filtered 
with a Hit/Miss technique. The filtered image emphasizes microcalcification-like structures
in  the  image.  Regions  of  Interest,  ROIs,  are  identified  by  a  three  step  process.  First,  the 
filtered  image  is  globally  thresholded  to  retain  only  the  brightest  0.5%  of  the  pixels  in  the 
image.  Second,  the  original  image  is  locally  thresholded  by  finding  pixels  that  have  a  gray 
level  value  greater  than  the  mean  plus  two  times  the  standard  deviation  of  a  51  by  51  pixel 
box  around  the  pixel  of  interest.  Only  pixels  surviving  both  thresholding  techniques  are 
retained.  Finally,  the  center  coordinates  of  the  minimum  number  of  64  by  64  pixel  ROIs 
enclosing  the  retained  pixels  are  determined  through  a  process  of  ROI  centroid  migration. 

The  Regions  of  Interest  passed  by  the  Focus  of  Attention  module  are  next  processed 
by  the  Indexing  module.  This  module  forms  an  initial  hypothesis  as  to  the  type  of  tissue  in 
the  ROI.  Three  features  are  extracted  from  each  ROI  to  develop  this  hypothesis.  The  first 
feature  is  the  number  of  individual  microcalcifications  identified  in  the  ROI.  The  next  two 
features  are  Laws  Energy  Ratios,  LER,  for  each  ROI.  The  LER  is  the  ratio  of  the  energy 
in  the  microcalcifications  only  versus  the  total  energy  in  the  ROI  after  filtering  with  the 
L5E5  and  L5R5  Laws  Masks.  ROIs  having  at  least  3  individual  calcifications,  an  L5E5 
LER  >0.0287  and  an  L5R5  LER  >0.0083  are  given  the  initial  hypothesis  of  being  a  region 
of microcalcifications. These ROIs are then sent to the final module, Matching, to confirm
the  hypothesis. 

The  Matching  module  takes  the  ROIs  passed  by  the  Indexing  stage  and  extracts  an 
additional  set  of  features  to  be  used  to  classify  the  tissue  type  as  normal  or  containing 
microcalcifications.  A  set  of  texture  features  based  on  Angular  Second  Moment  values, 
Power Spectrum Analysis and Laws Texture Measures is extracted for each ROI. A neural
network is used to determine whether the extracted features best match tissue containing
microcalcifications or normal breast tissue.

Again, the process is shown as a Flow Diagram in Figure 3.1. Each module and the
steps contained within that module are shown. The remaining sections describe each
module and the processing involved in detail.

3.4  Focus  of  Attention 

3.4.1  Overview.  The first step in processing the mammogram image is Focus of
Attention  (FOA).  This  stage  is  often  referred  to  as  segmentation.  The  purpose  of  this  stage 
is  to  eliminate  as  much  of  the  image  as  possible  that  obviously  does  not  contain  something 
of  interest.  The  output  of  this  stage  consists  of  regions  where  microcalcifications  may  be 
present.  These  regions  are  referred  to  as  Regions  of  Interest  (ROIs).  The  goal  of  this  stage 
is  to  pass  all  regions  containing  true  abnormalities,  or  true  positives,  and  as  few  regions  as 
possible  that  contain  normal  tissue,  or  false  ROIs. 

There are three steps in the FOA module for this system. The image is first
preprocessed to modify the gray levels in an attempt to improve microcalcification contrast
and dynamic range. The processed image is filtered using a Hit/Miss filtering technique
to identify pixel locations that represent potential microcalcifications. The filtered
image is next subjected to a global and local thresholding scheme. The image is globally
thresholded to retain only a percentage of the brightest pixels. Those pixel locations are
further evaluated by locally thresholding the same locations in the original image to
determine whether they are greater than the mean plus some multiple of the standard
deviation of a small window around the location. Finally, regions of interest are found by
grouping surviving pixel locations to retain the minimum number of 64 by 64 pixel regions.

Figure 3.1  Flow Diagram for Microcalcification Detection System

Figure 3.2  (a) Sample Mammogram Image (b) Histogram of the Image

3.4.2  Gray Level Modification.  After examining a number of sample mammograms
containing microcalcifications from the Training Data Set, it was discovered that
most of the gray levels containing microcalcification information were in the range of 2200
to 3600. A sample image and its histogram, Figures 3.2(a) and (b), provide an example
of how the pixel gray levels associated with background and microcalcifications are
distributed. A non-linear function is applied to the raw image to remap the gray levels
of interest such that they occupy a larger range of the available gray levels. Figure 3.3
illustrates the sigmoidal function used to remap the gray levels and the resulting image.
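
The remapping can be sketched in a few lines of Python. This is only an illustration: the text does not give the parameters of the sigmoid, so the center (2900, the midpoint of the 2200 to 3600 band) and the width (350) are assumed values, and the names `sigmoid_remap` and `remap_image` are hypothetical.

```python
import math

MAX_GRAY = 4095  # 12-bit images: gray levels 0..4095


def sigmoid_remap(gray, center=2900.0, width=350.0):
    """Map one gray level through a logistic curve.

    center and width are illustrative assumptions, chosen only so that the
    2200-3600 band is stretched over most of the output range.
    """
    return MAX_GRAY / (1.0 + math.exp(-(gray - center) / width))


def remap_image(image):
    """Apply the remapping to an image stored as a list of rows."""
    return [[round(sigmoid_remap(g)) for g in row] for row in image]


# The 1400-level input band 2200..3600 is spread over a wider output band,
# while levels well below 2200 are pushed toward zero.
stretched = sigmoid_remap(3600) - sigmoid_remap(2200)
```

With these assumed parameters the band of interest occupies more than twice as many output levels as input levels, which is the effect the non-linear mapping is meant to achieve.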

The  non-linear  mapping  has  two  desirable  effects: 


•  The dynamic range of the microcalcification regions is increased, which also yields
improved contrast of the microcalcifications as compared to the surrounding
background. To illustrate the increase in dynamic range, a small region containing
microcalcifications was extracted from the original and processed images of fourteen
mammograms. The dynamic range and contrast were calculated for the regions.
The dynamic range of a region is quantified as DR = Max - Min, where Max is
the maximum pixel value in the region and Min is the minimum pixel value. The
contrast is quantified using a measure defined by Morrow[24]. The contrast of a
region is found by

$$ C = \frac{f - b}{f + b} $$

where $f$ is the mean value of the microcalcification pixels and $b$ is the mean value
of the remaining, or background, pixels. Table 3.2 shows the Dynamic Range and
Contrast improvements for the sample regions. The non-linear mapping improved
the Dynamic Range by a factor of approximately 2.5 and increased the contrast of
microcalcification regions by over a factor of 4.

Figure 3.3  (a) Linear vs. Non-Linear (Sigmoid) Gray Level Mapping
(b) Effect of Non-Linear Mapping to Mammogram in Figure 3.2(a)

Image          Dynamic Range    Contrast
Original       745              0.0463
Processed      1733             0.2060
Improvement

Table 3.2  Non-Linear Gray Level Mapping Improvement to Dynamic Range and
Contrast [21]


•  Structures that resemble microcalcifications, but have gray levels below 2200,
are effectively removed. This keeps a number of false ROIs from being
passed to further stages in the Focus of Attention process.
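
The two region measures used above, DR = Max - Min and C = (f - b)/(f + b), can be computed as in the following sketch. The function names and the tiny 3 x 3 region are hypothetical; the mask marks the pixels treated as microcalcification.

```python
def dynamic_range(region):
    """DR = Max - Min over all pixels in the region."""
    values = [g for row in region for g in row]
    return max(values) - min(values)


def contrast(region, mask):
    """C = (f - b) / (f + b); mask holds 1 for microcalcification pixels."""
    pairs = [(g, m) for row, mrow in zip(region, mask) for g, m in zip(row, mrow)]
    fg = [g for g, m in pairs if m]
    bg = [g for g, m in pairs if not m]
    f = sum(fg) / len(fg)  # mean of microcalcification pixels
    b = sum(bg) / len(bg)  # mean of background pixels
    return (f - b) / (f + b)


# Hypothetical region: one bright pixel on a flat background.
region = [[100, 100, 100],
          [100, 300, 100],
          [100, 100, 100]]
mask = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
dr = dynamic_range(region)   # 300 - 100
c = contrast(region, mask)   # (300 - 100) / (300 + 100)
```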


3.4.3  Hit and Miss Filtering.  The Hit and Miss filtering technique used in the
Focus of Attention stage is modeled after the system developed by Chan and
Nishikawa[27, 8, 7]. This technique utilizes two filtered versions of the original image. The first filter, the
Hit  filter,  increases  the  signal  to  noise  ratio  of  structures  in  the  mammogram  that  resemble 
microcalcifications.  The  second  filter,  the  Miss  filter,  reduces  the  signal  to  noise  ratio  of 
those  same  structures.  A  differenced  image  is  obtained  by  subtracting  the  Miss  filtered 
image  from  the  Hit  filtered  image.  The  differencing  removes  the  majority  of  the  structured 
background  while  retaining  those  regions  resembling  the  targets  of  interest. 

The Hit, or matched, filter used is the three by three kernel shown in Figure 3.4(a).
A Box Rim filter, shown in Figure 3.4(b), is used as the Miss filter to suppress the target
signal. Previous work by Chan[8] indicated that a filter with an outer dimension of nine
pixels and an inner dimension of five pixels yielded the best performance. That work was
performed on 100 μm resolution images, the same resolution as the AFIT database, and used
a Free-response Receiver Operating Characteristic (FROC) analysis to compare 6 different
Hit/Miss filter combinations.

Figure 3.4  Spatial Filters: (a) Hit (matched); (b) Box Rim (suppression).
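
A minimal sketch of the Hit/Miss differencing follows. The Box Rim kernel uses the nine pixel outer and five pixel inner dimensions given in the text, with uniform rim weights as an assumption. The actual Hit kernel of Figure 3.4(a) is not reproduced here, so a uniform 3 x 3 average stands in for it; `HIT`, `box_rim_kernel`, `correlate` and `hit_miss_difference` are hypothetical names, and zero padding at the border is also an assumption.

```python
def box_rim_kernel(outer=9, inner=5):
    """Uniform weights on the rim of an outer x outer box, zeros inside."""
    lo = (outer - inner) // 2
    hi = lo + inner
    rim = outer * outer - inner * inner
    return [[0.0 if lo <= r < hi and lo <= c < hi else 1.0 / rim
             for c in range(outer)] for r in range(outer)]


def correlate(image, kernel):
    """Correlate with zero padding; the output has the size of the input."""
    rows, cols, k = len(image), len(image[0]), len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i, krow in enumerate(kernel):
                for j, w in enumerate(krow):
                    rr, cc = r + i - k, c + j - k
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += w * image[rr][cc]
            out[r][c] = acc
    return out


# Assumed stand-in for the matched filter of Figure 3.4(a).
HIT = [[1.0 / 9.0] * 3 for _ in range(3)]


def hit_miss_difference(image):
    """Differenced image: Hit filtered image minus Miss filtered image."""
    hit = correlate(image, HIT)
    miss = correlate(image, box_rim_kernel())
    return [[h - m for h, m in zip(hrow, mrow)] for hrow, mrow in zip(hit, miss)]
```

On a flat background the two filtered images agree and the difference is near zero, while a small bright spot survives because it raises the Hit response but falls inside the zeroed center of the Box Rim.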


The frequency response characteristics of the filters are shown in Figure 3.5. Through
the differencing process, the resulting frequency response of the system is a band pass
filter.  The  pass  band  of  approximately  0.15  to  0.45  in  normalized  frequency  (f  to  | 
radial  spatial  frequency)  indicates  structures  of  interest,  including  microcalcifications,  are 
composed  of  frequencies  in  this  range.  The  existence  of  microcalcifications  in  this  frequency 
range  corresponds  to  work  done  by  McCandless[21].  His  work  with  wavelet  decomposition 
also  indicated  a  range  of  |  to  |  contained  frequencies  common  to  microcalcifications. 


Figure 3.5  Filter Frequency Response for Hit and Miss Filter with Resulting Difference.

To demonstrate the effects of the Hit & Miss filter, Figures 3.6(a-d) provide
1-D cross sections from a region containing microcalcifications and Figures 3.7(a-d) show
the actual regions. This sample was taken from image AF055 and has a mass containing
microcalcifications. Figure 3.6(a) shows the original region with the microcalcification.
Figures 3.6(b) and (c) show the corresponding region after applying the filters. Figure 3.6(d)
shows the differenced signal. The same sequence but with the full region is shown in Figures
3.7(a-d). Note how the background mass structure has been reduced to gray scale levels
near zero, allowing the microcalcifications to be easily thresholded. Defining the Signal to
Noise Ratio as the mean value divided by the standard deviation[14],

$$ SNR = \frac{\text{mean}}{\text{standard deviation}} $$

the SNR of the original image was 0.0157, of the hit filtered image 0.0373, of the miss
filtered image 0.0249, and of the differenced image 0.3464. The overall effect on the sample
mammogram from Figure 3.2(a) is shown in Figure 3.8. Notice how the background has been
effectively removed while the microcalcifications have been made more prominent. This is
also evident in the histogram of the image, Figure 3.8(b), as the microcalcifications now
comprise the brightest pixels in the image.

3.4.4  Region of Interest Extraction.  Once the differenced image is obtained,
global  thresholding  is  applied  to  retain  only  a  percentage  of  pixels  with  high  gray  scale 
values.  The  histogram  of  the  differenced  image  is  used  to  identify  the  gray  scale  value 
where  only  0.5%  of  the  pixels  have  higher  values.  The  pixels  that  are  higher  than  the 
threshold  are  set  to  one,  otherwise  the  pixels  are  set  to  zero.  This  produces  a  binary  mask 
image  of  potential  microcalcifications.  This  binary  image  is  then  subjected  to  a  clustering 
algorithm  that  identifies  groups  of  connected  pixels.  Only  groups  that  contain  between 
3  and  45  pixels  are  retained.  This  will  eliminate  any  small  or  large  pixel  groups  that 
correspond  to  noise  or  other  artifacts  in  the  image.  This  image  is  later  used  to  extract  the 
microcalcification  masks  required  to  generate  the  texture  energy  ratios  and  to  determine 
the  number  of  clusters  for  each  ROI  for  the  Indexing  and  Matching  modules. 
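
The global threshold and the group size filter can be sketched as follows. The text does not state which connectivity the clustering algorithm uses, so 8-connectivity is assumed here, and ties at the cutoff value may keep slightly more than 0.5% of the pixels. The names `global_threshold` and `filter_groups` are hypothetical.

```python
from collections import deque


def global_threshold(image, keep_fraction=0.005):
    """Binary mask keeping roughly the brightest keep_fraction of pixels."""
    values = sorted((g for row in image for g in row), reverse=True)
    n_keep = max(1, int(len(values) * keep_fraction))
    cutoff = values[n_keep - 1]
    return [[1 if g >= cutoff else 0 for g in row] for row in image]


def filter_groups(mask, min_size=3, max_size=45):
    """Keep 8-connected groups whose size lies in [min_size, max_size]."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search collects one connected group.
            group, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                group.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and mask[rr][cc] and not seen[rr][cc]):
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            if min_size <= len(group) <= max_size:
                for r, c in group:
                    out[r][c] = 1
    return out
```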

Then, each of the remaining pixels is processed with a local threshold. For each
candidate pixel, a 51x51 window is extracted from the original image. The pixel $g(x,y)$ is
retained only if

$$ g(x,y) > \mu + n\sigma $$

where $\mu$ is the mean value of the local window, $\sigma$ is the standard deviation of the
window, and $n$ is the threshold factor.

Figure 3.6  1-D Cross Section of Effects of Hit & Miss Filters:
(c) Signal Filtered with Miss Filter (d) Differenced Signal

Figure 3.7  Effects of Hit & Miss Filters on Microcalcification Region:
(a) Original Region (b) Region Filtered with Hit Filter
(c) Region Filtered with Miss Filter (d) Differenced Region

Figure 3.8  (a) Mammogram after Hit/Miss Filtering
(b) Histogram of Filtered Mammogram.

Figure 3.9  Binary Mask Developed from:
(a) Hit/Miss Thresholding
(b) Local Thresholding
(c) Logically “AND” the Two Masks Together

The masks developed during the thresholding process can be seen in Figure 3.9(a-c).
The first mask is the result of globally thresholding the Hit/Miss filtered image. The
second mask is the result of the local thresholding process. By logically “AND”ing the two
masks together, only the pixel locations common to both masks are retained. This image
is used for ROI selection.
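
The local threshold and the logical AND can be sketched together, since testing only the pixels that survived the global threshold is equivalent to ANDing the two masks. The factor n = 2 follows the system overview; clipping the window at the image border and using the population standard deviation are assumptions, and `local_threshold` is a hypothetical name.

```python
import statistics


def local_threshold(image, candidates, half=25, n=2.0):
    """Keep candidate pixels brighter than mean + n * std of their window.

    half=25 gives the 51 x 51 window of the text. Only pixels set in
    `candidates` are tested, so the result is already the AND of the
    global and local masks.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not candidates[r][c]:
                continue
            window = [image[rr][cc]
                      for rr in range(max(0, r - half), min(rows, r + half + 1))
                      for cc in range(max(0, c - half), min(cols, c + half + 1))]
            mu = statistics.fmean(window)
            sigma = statistics.pstdev(window)
            if image[r][c] > mu + n * sigma:
                out[r][c] = 1
    return out
```
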
The minimum number of 64x64 boxes that enclose the surviving pixels is next
determined. This is accomplished by first dividing the image into 512 non-overlapping
windows (16 high by 32 wide). The center of mass of each window is calculated and the window
is recentered around that point. This process continues until the window moves less than
2 pixels. Overlapping windows are eliminated by comparing the center of mass of each
window. If the centers of mass of two windows are within d = 20, where d is the
Euclidean distance between the two window centers, the window with the lowest energy is
eliminated. A list of ROI center coordinates is now generated.
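
The centroid migration can be sketched as below, working on a list of (row, column) coordinates of the surviving pixels. Two points are assumptions rather than details from the text: the "energy" of a window is taken to be the number of surviving pixels it encloses, and empty windows are simply discarded. All function names are hypothetical.

```python
import math


def points_in_window(points, center, size=64):
    """Surviving pixels that fall inside a size x size window."""
    cy, cx = center
    half = size / 2.0
    return [(y, x) for y, x in points
            if abs(y - cy) <= half and abs(x - cx) <= half]


def migrate(points, center, size=64, tol=2.0, max_iter=50):
    """Recenter the window on its center of mass until it moves < tol."""
    for _ in range(max_iter):
        inside = points_in_window(points, center, size)
        if not inside:
            break
        new = (sum(y for y, _ in inside) / len(inside),
               sum(x for _, x in inside) / len(inside))
        moved = math.dist(center, new)
        center = new
        if moved < tol:
            break
    return center


def merge_centers(points, centers, size=64, min_dist=20.0):
    """Drop empty windows and windows within min_dist of a stronger one."""
    counts = {c: len(points_in_window(points, c, size)) for c in centers}
    kept = []
    for c in sorted(counts, key=counts.get, reverse=True):
        if counts[c] and all(math.dist(c, k) > min_dist for k in kept):
            kept.append(c)
    return kept
```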

At this point, the ROIs are ranked based on the number of pixel locations that
correspond to potential microcalcifications. The number of “on” pixels for each ROI location
in the binary mask is calculated. True microcalcification ROIs generally have a larger
number of pixel locations identified by the Hit/Miss filtering process than ROIs produced
by random noise or by other structures that responded to the filtering.

3.5  Indexing 

3.5.1  Overview.  The Indexing module receives a list of potential microcalcification
regions as identified during the Focus of Attention stage. The Indexing module forms
an initial hypothesis as to the classification of each ROI. In this case, Indexing attempts
to further sort out the ROIs with microcalcifications from those containing only normal
tissue. In this stage, three features are extracted from each ROI: the number of individual
calcifications and two Laws Energy Ratios developed from filtering the ROI with Laws
masks.


Figure 3.10  Laws Masks Used for Indexing: (a) L5R5 (b) L5E5

3.5.2  Indexing Feature Extraction.  The first feature extracted is the number
of individual calcifications as detected by the Hit/Miss filtering operation. An ROI is
extracted from the binary image produced by globally thresholding the Hit/Miss filtered
image for each coordinate passed by the FOA module. ROIs containing microcalcifications
generally have a large number of individual calcifications. This relates to the information
used by a radiologist in diagnosing a region containing microcalcifications. Recall Table
2.1, which showed that regions of malignant microcalcifications generally contain 5 or more
individual calcifications in a 1 ml volume. The ROIs are 64 x 64 pixel regions with 100 μm
pixels, which gives a 6.4 mm by 6.4 mm size. For a volume of (6.4 mm)^3, a malignant region
of this size would generally contain more than 1.31 individual calcifications. Based on
this analysis and observations during system development, ROIs are required to have at
least 3 individual calcifications to be given the hypothesis of being a region containing
microcalcifications.
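
The 1.31 figure follows from scaling the 5 calcifications per ml density to the ROI volume, as this short check shows (1 ml = 1000 mm^3; the variable names are illustrative only).

```python
ROI_SIDE_PIXELS = 64
PIXEL_SIZE_MM = 0.1      # 100 micrometre pixels
CALCS_PER_ML = 5         # density quoted from Table 2.1
MM3_PER_ML = 1000.0      # 1 ml = 1 cm^3 = 1000 mm^3

side_mm = ROI_SIDE_PIXELS * PIXEL_SIZE_MM     # 6.4 mm
volume_ml = side_mm ** 3 / MM3_PER_ML         # about 0.262 ml
expected_calcs = CALCS_PER_ML * volume_ml     # about 1.31
```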

The ROIs from the FOA next have two Laws Energy Ratios, as described in detail
in Section 3.6.4, calculated using the binary mask used to determine the number of
calcifications and the same region location extracted from the original image. From the
original image and binary mask ROIs, the indexing stage determines the Laws Energy Ratio, LER,
for the L5E5 and L5R5 Laws masks, which are shown in Figure 3.10. These two masks were
selected during system development for their ability to discriminate regions with
microcalcifications from those without in the Training Data Set. Only regions having an
L5E5 and L5R5 LER greater than a threshold determined during system development are
hypothesized to contain microcalcifications.

3.5.3  Indexing Criteria.  After processing the Training Data Set images during
system development, three indexing criteria were developed as shown in Table 3.3. The
first criterion is that ROIs must have at least 3 individual calcifications. For the Laws
Energy Ratios, it was determined that an L5R5 LER of 0.0083 and an L5E5 LER of 0.0346 or
greater was appropriate for separating microcalcifications from normal tissue in the
Training Data Set. Any ROI meeting these criteria is assigned an initial hypothesis of being
a region of microcalcifications. These regions are now sent to the Matching Module to
confirm or reject this hypothesis.
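
The three criteria combine into a single test, sketched here with the thresholds quoted in this section. The function name is hypothetical, and the inclusive comparisons follow the "at least" and "or greater" wording of the paragraph above.

```python
MIN_CALCIFICATIONS = 3
L5R5_LER_MIN = 0.0083
L5E5_LER_MIN = 0.0346


def passes_indexing(num_calcifications, l5r5_ler, l5e5_ler):
    """True when an ROI earns the initial microcalcification hypothesis."""
    return (num_calcifications >= MIN_CALCIFICATIONS
            and l5r5_ler >= L5R5_LER_MIN
            and l5e5_ler >= L5E5_LER_MIN)
```
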
Index Feature         Criteria
Number of Clusters    ≥ 3
L5R5 LER              > 0.0083
L5E5 LER              > 0.0346

Table 3.3  Indexing Features and Criteria

3.6  Feature Extraction

3.6.1  Overview.  The ROIs given an initial hypothesis of being a region of
microcalcifications are passed to the Feature Extraction module which processes the region in
an  attempt  to  provide  a  quantitative  description  of  image  characteristics  that  can  be  used 
by  a  classifier  to  discriminate  between  microcalcification  and  normal  tissue  regions.  Three 
different  texture  metrics  are  examined  for  their  ability  to  extract  the  “diagnosis  essence” 
of  the  ROI: 

•  Angular  Second  Moment 

•  Power  Spectrum  Analysis 

•  Laws  Energy  Ratios 

Each  technique  is  discussed  in  detail  in  the  following  sections. 

3.6.2  Angular Second Moment.  Angular Second Moment, ASM, is a measure
often used to classify images based on texture analysis. The ASM value is based on
gray level co-occurrences, i.e., on joint probability densities of pairs of gray levels. Let
$\delta = (\Delta x, \Delta y)$ be a vector in the $(x,y)$ plane. For any such vector and image
$f(x,y)$, the joint probability density of the pairs of gray levels that occur at points
separated by $\delta$ can be found. This joint density takes the form of a matrix, $C_\delta$,
commonly referred to as the gray level co-occurrence matrix, where $C_\delta(i,j)$ is the
probability of the pair of gray levels $(i,j)$ occurring at a vector separation $\delta$. The
co-occurrence matrix is m by m, where m is the number of possible gray levels.

It is easy to compute the $C_\delta$ matrix for a given image by counting the number of
times each pair of gray levels occurs at a vector separation $\delta = (\Delta x, \Delta y)$,
where $\Delta x$ and $\Delta y$ are integers. The following example illustrates how the
$C_\delta$ matrix is developed for $\delta = (1,0)$.

Figure 3.11  Co-occurrence Matrix Example: (a) Image (b) $C_\delta$ for $\delta = (1,0)$

Weszka et al., in their study of texture measures for the classification of terrain,
point out:

If a texture is coarse, and $\delta$ is small compared to the sizes of the texture
elements, the pairs of points at separation $\delta$ should usually have similar gray
levels. This means that the high values in the matrix $C_\delta$ should be concentrated
on or near its main diagonal. Conversely, for a fine texture, if $\delta$ is comparable
to the texture element size, then the gray levels of points separated by $\delta$ should
often be quite different, so that the values in $C_\delta$ should be spread out relatively
uniformly. Thus a good way to analyze texture coarseness would be to compute,
for various values of the magnitude of $\delta$, some measure of the scatter of the $C_\delta$
values around the main diagonal[41].

Similarly, texture directionality can be analyzed by comparing the spread measures of
$C_\delta$ for various directions of the vector $\delta$.

The Angular Second Moment calculation is defined in Equation 3.1.

$$ ASM = \sum_i \sum_j p(i,j)^2 $$    (3.1)

In this form, $p(i,j)$ is defined as

$$ p(i,j) = \frac{C_\delta(i,j)}{\sum_x \sum_y C_\delta(x,y)} $$

This measure is smallest when all $p(i,j)$ are as equal as possible and large when some
elements are large and others small, such as when the values are largely concentrated
around the main diagonal. For the example $C_\delta$ matrix in Figure 3.11, the ASM value is
0.0972.
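
The co-occurrence counting and the ASM can be sketched as follows. The 4 x 4 image is a hypothetical example, not the image of Figure 3.11, and the sketch counts each pair in one direction only, which is an assumption about how the matrix is built. The function names are illustrative.

```python
def cooccurrence(image, dx, dy, levels):
    """Count gray-level pairs (i, j) found at offset (dx, dy).

    dx steps along columns and dy along rows; pairs that would fall
    outside the image are skipped.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                counts[image[y][x]][image[y2][x2]] += 1
    return counts


def angular_second_moment(counts):
    """ASM = sum of p(i, j)^2, with p the counts normalized to sum to 1."""
    total = sum(map(sum, counts))
    return sum((c / total) ** 2 for row in counts for c in row)


# Hypothetical 4 x 4 image with gray levels 0..3.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
c_delta = cooccurrence(image, dx=1, dy=0, levels=4)
asm = angular_second_moment(c_delta)   # 24 / 144 for this image
```

Repeating the calculation for several offsets gives a set of ASM values that reflect coarseness and directionality.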

Previous work by Kocur[17] and Chitre[9, 10] classified benign and malignant
microcalcifications using texture features, specifically, Angular Second Moment. In both studies,
only a single value of $\delta$ was used in constructing the $C_\delta$ matrix. A better representation
of the true texture present in the microcalcification region of interest may be gained by
evaluating multiple values of $\delta$ in order to determine texture coarseness and
directionality. This measure will be used to separate normal tissue ROIs from ROIs containing
microcalcifications.

3.6.3  Power Spectrum Analysis.  The Fourier transform of an image $f(x,y)$ is
defined by Equation 3.2 and the Fourier power spectrum is $|F|^2 = FF^*$ (where * is the
complex conjugate).

$$ F(u,v) = \int\!\!\int_{-\infty}^{\infty} f(x,y)\, e^{-2\pi i (ux + vy)}\, dx\, dy $$    (3.2)

The radial distribution of values in $|F|^2$ is sensitive to texture coarseness in
$f(x,y)$[41]. A region of coarse texture will have high values concentrated near the
origin, while fine texture regions will have values of $|F|^2$ more spread out. A method to
analyze texture properties of an image using this fact is to find the averages of $|F|^2$ taken
over ring-shaped regions centered at the origin, as given by Equation 3.3 for various values
of the ring radius $r$, where $r = \sqrt{u^2 + v^2}$ and $\theta = \tan^{-1}(v/u)$[41].

$$ \phi_r = \int_0^{2\pi} |F(r,\theta)|^2\, d\theta $$    (3.3)
Since the regions analyzed in this research are n by n digital images, the discrete
Fourier transform is used and the texture features from the power spectrum, $\phi_{r_1 r_2}$,
are calculated by Equation 3.4.

$$ \phi_{r_1 r_2} = \sum_{\substack{r_1^2 < u^2 + v^2 \le r_2^2 \\ 0 \le u, v \le n-1}} |F(u,v)|^2 $$    (3.4)

Various  values  of  the  inner  and  outer  ring  radii  ri  and  r2  are  selected  to  correspond 
with  frequency  limits  of  various  size  objects.  For  the  64  by  64  ROIs  being  generated  by 
the  Hit/Miss  filtering  stage,  rings  investigated  and  the  corresponding  object  size  are  listed 
in  Table  3.4. 

$r_1, r_2$    Object Size (pixels)
[0,1]         32
(1,2]         16
(2,4]         8
(4,8]         4
(8,16]        2
(16,32]       1

Table 3.4  Inner and Outer Ring Radii and Corresponding Object Size


             Micro ROI             Normal ROI
Ring         Mean      Std         Mean      Std
[0,1]        0.5956    0.0728      0.6159    0.0569
(1,2]        0.0241    0.0098      0.0230    0.0096
(2,4]        0.0441    0.0144      0.0411    0.0129
(4,8]        0.0705    0.0206      0.0582    0.0133
(8,16]       0.1049    0.0136
(16,32]      0.1410    0.0199      0.1480    0.0186

Table 3.5  Power Spectrum Ring Ratios for Microcalcification ROIs and Normal ROIs
from 14 Sample Images

Sample regions of microcalcifications and normal tissue with their corresponding
power spectrum are shown in Figure 3.12. Notice how the power spectrum of the
microcalcification image is more concentrated in the low frequency values. This is reflected in
the ring ratios as the fraction of energy in the lower frequencies is higher for the
microcalcifications, as shown in Table 3.5.
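
The ring features can be sketched with a direct discrete Fourier transform, which is slow but adequate for a small illustration. The rings are taken as half-open intervals (r1, r2] following Table 3.4, with the origin included in the first ring. Dividing each ring sum by the total spectrum energy to obtain a "ring ratio" is an assumption about how the ratios of Table 3.5 are normalized, and all names are hypothetical.

```python
import cmath

RINGS = [(0, 1), (1, 2), (2, 4), (4, 8), (8, 16), (16, 32)]


def power_spectrum(image):
    """|F(u, v)|^2 of an n x n image by a direct DFT (small n only)."""
    n = len(image)
    spectrum = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0j
            for x in range(n):
                for y in range(n):
                    angle = -2j * cmath.pi * (u * x + v * y) / n
                    total += image[x][y] * cmath.exp(angle)
            spectrum[u][v] = abs(total) ** 2
    return spectrum


def ring_energy(spectrum, r1, r2):
    """Sum |F|^2 over r1^2 < u^2 + v^2 <= r2^2 (origin joins the first ring)."""
    n = len(spectrum)
    return sum(spectrum[u][v]
               for u in range(n) for v in range(n)
               if (r1 * r1 < u * u + v * v <= r2 * r2)
               or (r1 == 0 and u == 0 and v == 0))


def ring_ratios(image, rings=RINGS):
    """Ring energies as fractions of the total energy (assumed normalization)."""
    spectrum = power_spectrum(image)
    total = sum(map(sum, spectrum))
    return [ring_energy(spectrum, r1, r2) / total for r1, r2 in rings]
```

For a 64 x 64 ROI a fast Fourier transform routine would replace the direct sum; the ring bookkeeping stays the same.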


3.6.4  Laws Texture Measures.  A set of texture features based on the correlation
of pixel neighborhoods with a set of standard masks was developed by Laws[30]. The masks
are derived from three simple vectors: L3 = [1 2 1], E3 = [-1 0 1] and S3 = [-1 2 -1]. The
vectors represent one-dimensional operations of center-weighted local averaging, symmetric
first (“edge detection”) and second (“spot detection”) differencing. By convolving these
vectors with themselves and each other, five vectors are developed which are listed in Table 3.6.

By taking the outer product of every combination of vectors, twenty five 5x5 texture
“masks” are created. Each mask is convolved with an image and the statistics of the
resulting image, such as the sums of the squared or absolute values of each pixel, are used
to define the texture properties of the image. This results in a texture energy measure of
the image.

Label   Result of    Vector              Description
L5      L3 * L3      [ 1  4  6  4  1]    Local Average
S5      E3 * E3      [-1  0  2  0 -1]    Spot Detector
R5      S3 * S3      [ 1 -4  6 -4  1]    Ripple Detector
E5      L3 * E3      [-1 -2  0  2  1]    Edge Detector
W5      E3 * S3      [-1  2  0 -2  1]    Wave Average

Table 3.6  Laws Texture Vectors

ROI Type              L5E5 LER mean    L5E5 LER std
Microcalcification    0.0842           0.0498
Normal Tissue

Table 3.7  Laws Energy Ratios for Micro and Normal ROIs with L5E5 Mask
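
The construction of the vectors and masks used by the Indexing stage can be checked with a short sketch. Only L5, E5 and R5 are built, since those are the vectors behind the L5E5 and L5R5 masks. Taking the first vector as the column and the second as the row of the outer product is an assumed convention, and `convolve` and `outer` are hypothetical helper names.

```python
def convolve(a, b):
    """Full 1-D convolution of two short integer vectors."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def outer(col, row):
    """Outer product: element (i, j) is col[i] * row[j]."""
    return [[c * r for r in row] for c in col]


L3, E3, S3 = [1, 2, 1], [-1, 0, 1], [-1, 2, -1]

L5 = convolve(L3, L3)   # local average
E5 = convolve(L3, E3)   # edge detector
R5 = convolve(S3, S3)   # ripple detector

L5E5 = outer(L5, E5)    # 5 x 5 mask
L5R5 = outer(L5, R5)    # 5 x 5 mask
```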

For this research, all twenty five masks are investigated to determine which, if any,
respond strongly to regions containing microcalcifications while having little effect on
normal tissue areas. The Laws features are calculated for regions detected by the Hit/Miss
filtering. A ratio of texture energy is calculated for each region of interest and each texture
mask. This ratio is defined in Equation 3.5, where LER is the Laws Energy Ratio, $E_{Micros}$ is
the energy in the Laws filtered image corresponding to the possible microcalcifications, and
$E_{Total}$ is the total energy in the Laws filtered image.

$$ LER = \frac{E_{Micros}}{E_{Total}} $$    (3.5)

$E_{Micros}$ is determined by summing only the pixel values in the ROI that correspond
to the pixels in the binary mask developed during the FOA module. $E_{Total}$ is the sum of
all pixel values in the ROI.
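
A sketch of the ratio follows. It assumes the ROI has already been filtered with a Laws mask and takes the energy of a pixel to be the absolute value of its filter response, which is one of the two statistics mentioned above (the squared value is the other). The function name is hypothetical.

```python
def laws_energy_ratio(filtered_roi, mask):
    """LER = E_Micros / E_Total for one Laws filtered ROI.

    filtered_roi: ROI after filtering with a Laws mask (list of rows).
    mask: binary mask from the FOA module, 1 at possible microcalcifications.
    Energy is taken as the absolute filter response (an assumption).
    """
    e_total = sum(abs(v) for row in filtered_roi for v in row)
    e_micros = sum(abs(v)
                   for row, mrow in zip(filtered_roi, mask)
                   for v, m in zip(row, mrow) if m)
    return e_micros / e_total if e_total else 0.0
```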

Figure 3.13 shows the results of filtering two ROIs with the Laws mask L5E5. The
center images are the binary masks showing the areas corresponding to possible
microcalcifications as detected by the FOA module. Notice how the filtered image of the
microcalcifications has the majority of the energy concentrated in the areas found in the binary
masks. This results in a high LER. The false ROI filtered images have energy more evenly
distributed throughout the image, which results in a lower LER. The mean and standard
deviation for the L5E5 LER for the ROIs identified in the Training Data Set are listed in
Table 3.7.

Figure 3.13  Microcalcification Tissue: (a) ROI, (b) Binary Mask, (c) L5E5 Filtered ROI
Normal Tissue: (d) ROI, (e) Binary Mask, (f) L5E5 Filtered ROI

3.7  Prediction

3.7.1  Overview.  The Prediction Module in a Model Based Vision System
produces quantitatively correct “signature” features suitable for matching. These features are
used to match those obtained by the Feature Extraction module. For this research, the
prediction module does not develop a model, but references features obtained during
system development from training data. These features are used to train the neural network
used in the Matching module. From known microcalcification and normal tissue regions,
the three different texture measures (ASM, Power Spectrum Analysis, and Laws Energy
Ratios) are calculated. This results in a total of 56 different features for each training
region. In an effort to reduce the training feature space, feature selection is done based on
Fisher Ratio analysis.

3.7.2  Feature Selection.  In any pattern recognition problem, it is desirable to
reduce the number of features used in classifying a set of data. This reduces computational
requirements while usually improving the generalization of the classifier. The trick is to
find out which of the available features are the discriminatingly relevant features, that is,
which best separate one class from another. The Fisher Ratio is a simple statistical measure
that quantifies the separation of two classes for a single feature. Recall that the Fisher Ratio
is given by

$$ F = \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2} $$

where $\mu_k$ and $\sigma_k^2$ are the mean and variance of the feature for class $k$.
For each feature, the F-ratio is calculated. The features with the highest F-ratios are used
for matching. The number of features that can be used is determined in the next section,
Matching.
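
Ranking features by Fisher Ratio can be sketched as follows, assuming the common two-class form in which the squared difference of the class means is divided by the sum of the class variances. The population variance is used, which is an assumption, and the function names and sample values are hypothetical.

```python
import statistics


def fisher_ratio(class_a, class_b):
    """(mean_a - mean_b)^2 / (var_a + var_b) for one feature."""
    mu_a, mu_b = statistics.fmean(class_a), statistics.fmean(class_b)
    var_a = statistics.pvariance(class_a)
    var_b = statistics.pvariance(class_b)
    return (mu_a - mu_b) ** 2 / (var_a + var_b)


def top_features(samples_a, samples_b, k):
    """Indices of the k features with the highest Fisher Ratio.

    samples_a and samples_b are lists of feature vectors, one per ROI.
    """
    n_features = len(samples_a[0])
    scores = [fisher_ratio([s[i] for s in samples_a],
                           [s[i] for s in samples_b])
              for i in range(n_features)]
    order = sorted(range(n_features), key=lambda i: scores[i], reverse=True)
    return order[:k]
```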

3.8  Matching

3.8.1  Overview.  ROIs surviving the Focus of Attention and Indexing stages
are  assigned  an  initial  hypothesis  of  being  a  region  of  microcalcifications.  The  Matching 
Module  attempts  to  confirm  or  reject  this  hypothesis  by  using  the  information  provided  by 
the  Feature  Extraction  and  Prediction  Modules  to  discriminate  between  microcalcification 
and  normal  tissue.  The  features  used  by  the  classifier  are  selected  based  on  the  Fisher  Ratio 
calculation  in  an  attempt  to  identify  the  more  discriminatingly  relevant  features.  These 
features  are  used  by  a  single  hidden  layer  neural  network  to  perform  the  classification.  The 
neural  network  is  trained  using  a  modified  backpropagation  algorithm  to  reduce  training 
time.  The  following  sections  review  in  detail  the  methods  used. 

3.8.2  Classification.  A  single  hidden  layer  neural  network  with  one  output  node, 
as  shown  in  Figure  3.14,  is  used  for  classifying  the  ROIs  using  the  extracted  features.  The 
neural  network  is  trained  using  a  batch  backpropagation  algorithm  to  adjust  the  weights. 
The network outputs are clamped to $1 - \epsilon$ for any value greater than $1 - \epsilon$ and to $\epsilon$ for
values less than $\epsilon$ during training to reduce the likelihood of the network getting stuck in
a local minimum[37].

The number of input nodes, $I$, is the number of features. This value is determined
using Foley’s Rule[31], which requires that the number of training samples per class be at
least three times the number of features. Since there are only 18 microcalcification samples
in the training set, a maximum of 6 features are used. The number of hidden nodes, $L$,
allowed is determined using Cover’s Rule[31], which states

$$L \le \frac{N}{2(I + 1)}$$

where $N$ is the number of samples in the training set. With 99 samples in the training set and
6 features, this yields a maximum of approximately 7 hidden nodes. Foley’s and Cover’s
rules give a good starting point for choosing a neural network architecture, but are
not set in stone. An architecture exceeding these values can be used if an independent
test set is held out to verify the neural network performance.
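The arithmetic behind these two limits can be checked with a short sketch. The form used for Cover's Rule, $L \le N / (2(I + 1))$, is an assumption chosen because it reproduces the figure of approximately 7 hidden nodes quoted above; the function names are hypothetical.

```python
def max_features_foley(samples_in_smallest_class, samples_per_feature=3):
    """Largest feature count that leaves at least `samples_per_feature`
    training samples per class for every feature (Foley's Rule)."""
    return samples_in_smallest_class // samples_per_feature

def max_hidden_nodes_cover(n_training_samples, n_features):
    """Assumed form of Cover's Rule: L <= N / (2 * (I + 1))."""
    return n_training_samples / (2 * (n_features + 1))

# Values quoted in the text: 18 microcalcification samples, 99 training samples.
n_features = max_features_foley(18)
hidden_limit = max_hidden_nodes_cover(99, n_features)
```

With these inputs the feature limit is 6 and the hidden node limit is 99/14, which is slightly above 7.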

From the Prediction Module Feature Selection, the top 6 features based on F-ratio
analysis are used for training and testing of the neural network. To examine the effects of
various architectures, the number of hidden nodes is varied from 1 to 9. Two data sets, the
Evaluation and Normal Data Sets, are withheld to verify the classification performance
of the Matching Module.


3.8.3  Modified  Backpropagation  Algorithm.  One  of  the  difficulties  in  applying  a 
classification  scheme  to  the  breast  cancer  problem  is  the  lack  of  samples  in  one  or  both 
classes.  There  are  generally  a  larger  number  of  normal  tissue  samples  than  abnormal.  This 
is  a  major  disadvantage  for  a  backpropagation  trained  neural  network,  as  the  convergence 
of  the  net  output  error  is  very  slow  [5].  This  occurs  when  the  negative  gradient  vector 
computed by backpropagation actually increases the error for the subordinate class during
the  initial  iterations. 

A  solution  to  this  problem  is  to  calculate  a  direction  in  the  weight  space  that  is 
downhill  in  both  the  dominant  and  subordinate  classes.  Anand,  et  al.[5],  recommend  finding 
a  descent  vector  v  which  satisfies  Equation  3.6. 

$$\mathbf{v} \cdot \nabla E_c(W) < 0, \quad c = 1, 2 \qquad (3.6)$$

This vector takes the place of the negative gradient, $-\nabla E(W)$, in the backpropagation
weight update, Equation 3.7, where $W(k)$ is the collection of weights in the neural network
at the beginning of the $k$th iteration, $\lambda$, a positive constant, is the learning rate, and
$\nabla E(W) = \sum_c \nabla E_c(W)$, $c = 1, 2$.

$$W(k + 1) = W(k) - \lambda \nabla E(W) \qquad (3.7)$$

The direction of $\mathbf{v}$ is set to bisect the angle between $-\nabla E_1(W)$ and $-\nabla E_2(W)$, the
negative gradients of the error for classes 1 and 2, respectively. This is accomplished by finding
the direction vector $\hat{\mathbf{v}}$ using Equation 3.8. The magnitude of $\mathbf{v}$ is set to the magnitude that
would have been computed by standard backpropagation, as in Equation 3.9.

$$\hat{\mathbf{v}} = \frac{1}{2} \left( \frac{-\nabla E_1(W)}{\| -\nabla E_1(W) \|} + \frac{-\nabla E_2(W)}{\| -\nabla E_2(W) \|} \right) \qquad (3.8)$$

$$\| \mathbf{v} \| = \| \nabla E_1(W) + \nabla E_2(W) \| \qquad (3.9)$$

This modified backpropagation algorithm is used to train the neural networks in
the hope of converging more rapidly to a network that has minimum error in both
classes.
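The construction of the descent vector can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this research: the per-class gradients are taken as given, both are assumed to be non-zero, and the function names are hypothetical. The vector bisects the two negative class gradients and is rescaled to the norm of the summed gradient.

```python
import numpy as np

def bisecting_descent_vector(grad_1, grad_2):
    """Vector bisecting -grad_1 and -grad_2, scaled to the magnitude
    that standard backpropagation would use, ||grad_1 + grad_2||."""
    g1 = np.asarray(grad_1, dtype=float)
    g2 = np.asarray(grad_2, dtype=float)
    # Unit vectors along the two negative class gradients.
    u1 = -g1 / np.linalg.norm(g1)
    u2 = -g2 / np.linalg.norm(g2)
    direction = 0.5 * (u1 + u2)
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        raise ValueError("gradients are exactly opposed; no bisector exists")
    return np.linalg.norm(g1 + g2) * direction / norm

def update_weights(weights, grad_1, grad_2, learning_rate=0.1):
    """One weight update that moves along the bisecting descent vector."""
    return weights + learning_rate * bisecting_descent_vector(grad_1, grad_2)

# Hypothetical gradients: class 1 dominates, class 2 is the subordinate class.
g_dominant = np.array([10.0, 0.0])
g_subordinate = np.array([-1.0, 1.0])
v = bisecting_descent_vector(g_dominant, g_subordinate)
plain_step = -(g_dominant + g_subordinate)
```

In this example the plain negative gradient has a positive inner product with the subordinate class gradient, so a standard step would raise that class's error, while the bisecting vector has a negative inner product with both class gradients.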

3.9  Summary

The Model Based Vision architecture is used to develop the microcalcification detection
system. The Focus of Attention module uses a Hit/Miss filtering technique followed by
global and local thresholding to select possible Regions of Interest (ROIs). The Indexing
Module uses information from two Laws Energy Ratios and the number of individual
calcifications in the ROI in assigning the initial hypothesis. The Feature Extraction Module
obtains  texture  features  using  three  different  techniques:  Angular  Second  Moment,  Laws 
Energy  Ratios,  and  Power  Spectrum  Analysis.  The  top  6  features  based  on  Fisher  Ratios 
determined  during  the  Prediction  Module  are  retained  for  use  in  the  Matching  Module. 
The Matching Module uses a Multilayer Perceptron Neural Network trained with a modified
backpropagation algorithm to classify the ROI as containing microcalcifications or normal tissue. The
results  obtained  from  testing  on  the  AFIT  database  are  provided  in  the  next  chapter. 
IV.  Analysis and Results


4.1  Introduction

The  microcalcification  system  was  developed  and  evaluated  using  three  separate  data 
sets.  The  first  data  set,  labeled  Training  Data  Set,  was  used  to  initially  develop  the 
system  and  determine  thresholding  levels  and  indexing  criteria  values  used  in  the  Focus 
of Attention and Indexing Modules. The second data set, the Test Data Set, was used to
verify the accuracy of the thresholds determined during training. Analysis of the results
from the test set was used to adjust threshold values before going on to the final data
set. Once all parameters and thresholds had been determined using the training set and
slightly modified to improve accuracy on the test set, a final data set, the Evaluation
Data Set, was used to verify the detection capability of the system on unseen data.
This was a “sanity check” to determine if the system was overtuned to the data used for
development. The results from the Evaluation Data Set should be a reasonable indication of
the performance of the system on any image data set. An additional data set, the Normal Data
Set, made up of images with no radiologist-noted abnormalities, was used to evaluate how
the system performs for images containing no diagnosed microcalcifications. The number
of images and true regions of interest for each data set are listed in Table 4.1. Additional
details  concerning  the  data  sets  used  can  be  found  in  Appendix  A. 

4.2  System Development: Training Data Set

4.2.1  Focus of Attention Module.  The Focus of Attention module was initially
evaluated  using  the  14  mammograms  making  up  the  Training  Data  Set.  Each  image  had 

Data Set      Number of Images    Number of Microcalcification Regions
Training      14                  18
Testing       17                  20
Evaluation    12                  16
Normal        10                  0
Total         53                  54

Table 4.1  Number of Images and Microcalcification Regions for Training, Testing, Evaluation
and Normal Data Sets
Figure 4.1  Sample Images: (a) Full Mammogram (b) Zoom on Microcalcification


a  radiologist  noted  and  biopsy  confirmed  malignant  or  benign  microcalcifications.  The 
microcalcifications  in  each  image  varied  from  very  high  to  low  contrast  in  comparison  to 
the  surrounding  background.  Figure  4.1  provides  an  example  of  the  mammogram  images 
used  in  this  study  and  a  close-up  of  the  microcalcification  present  in  the  image. 

Each  training  image  was  first  processed  by  the  FOA  module  to  identify  the  proper 
thresholds  for  the  global  and  local  thresholding  stages.  Each  training  image  was  processed 
multiple  times  as  the  two  parameters  were  varied  independently  -  the  percentage  of  pixels 
passed  in  global  thresholding  and  the  multiplicative  factor  of  the  standard  deviation  in  the 
local  thresholding.  The  first  parameter  that  was  varied  was  the  top  percentage  of  pixels 
passed  by  the  global  thresholding  stage.  While  this  parameter  was  varied  from  0.2%  to 
0.5%,  the  multiplicative  factor  was  held  constant  at  a  value  of  2.0.  The  multiplicative  factor 
was  then  varied  from  1.0  to  2.5  as  the  top  percentage  of  pixels  was  held  constant  to  a  value 
of 0.3%. Figures 4.2 and 4.3 show the results obtained from the 14 training images presented


Figure  4.2  Free  Response  Operating  Curve  for  Varying  Global  Threshold 


Figure  4.3  Free  Response  Operating  Curve  for  Varying  Local  Threshold 

as Free Response Operating Curves (FROC). The FROC shows the percentage of correctly
segmented  regions  versus  the  number  of  false  ROIs  per  image.  The  ideal  operating  point 
is  the  upper  left  corner  of  the  plot  which  indicates  the  correct  regions  are  being  identified 
with  a  minimal  number  of  false  regions  being  retained. 
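The two parameters varied above, the top percentage of pixels passed by global thresholding and the multiplicative factor on the local standard deviation, can be illustrated with a minimal sketch. The local rule used here (a pixel passes if it exceeds the local mean plus the factor times the local standard deviation) and the 5 by 5 window are assumptions made for illustration, since the exact local rule and window size are not restated in this section; the function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def global_threshold(diff_image, top_percent):
    """Binary mask keeping the brightest `top_percent` percent of pixels."""
    cutoff = np.percentile(diff_image, 100.0 - top_percent)
    return diff_image > cutoff

def local_threshold(diff_image, candidates, factor, window=5):
    """Keep candidate pixels above local mean + factor * local std.

    `window` is an assumed neighbourhood size, not a value from the text.
    """
    pad = window // 2
    padded = np.pad(diff_image.astype(float), pad, mode="reflect")
    patches = sliding_window_view(padded, (window, window))
    local_mean = patches.mean(axis=(-2, -1))
    local_std = patches.std(axis=(-2, -1))
    return candidates & (diff_image > local_mean + factor * local_std)

# Hypothetical difference image: noise plus one bright synthetic spot.
rng = np.random.default_rng(0)
image = rng.normal(0.0, 1.0, size=(64, 64))
image[32, 32] = 50.0
candidates = global_threshold(image, 0.5)
mask = local_threshold(image, candidates, factor=2.0)
```

Raising either parameter changes how many pixels survive, which is what moves the operating point along the FROC.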

By allowing the top 0.5% of pixels in the differenced image to pass the global thresholding
and using a multiplicative factor of 2.0 in the local thresholding, 100% of the microcalcification
regions in the 14 training images can be identified with approximately 45 false
ROIs per image. The goal of this stage is to pass all of the potential regions on to the
Indexing module, which attempts to further reduce the false regions while retaining the
true regions containing microcalcifications. The number of correct regions identified, their
ranking  based  on  number  of  “on”  pixels  in  the  binary  mask  ROI,  and  the  total  number  of 
regions  found  for  each  training  image  is  shown  in  Table  4.2.  Note  that  except  for  one  region 
Image    Number of Correct Regions    Rank           Total Number of Regions
AF005    2/2                          1, 2           59
AF006    3/3                          1, 2, 5        66
AF007    2/2                          1, 2           49
AF008    1/1                          13             17
AF009    1/1                          1              21
AF020    1/1                          7              39
AF022    1/1                          1              66
AF024    1/1                          1              38
AF033    1/1                          2              41
AF038    1/1                          1              75
AF040    1/1                          5              33
AF045    1/1                          1              35
AF047    1/1                          4              25
AF055    1/1                          7              30
Total    18/18                        3.17 (mean)    594

Table 4.2  Results of Focus of Attention Module using Training Data


in  image  AF008  which  ranked  13th  out  of  17,  all  the  remaining  regions  were  ranked  within 
the  top  7  regions.  The  system  could  pass  only  the  top  7  ROIs  based  on  this  ranking  and 
have  an  acceptable  Probability  of  Detection  of  95.4%  and  an  average  False  ROI  Rate  of 
5.93  regions  per  image.  To  improve  this  performance,  the  Indexing  and  Matching  Modules 
are  used  to  reduce  the  False  ROI  Rate. 


4.2.2  Indexing Module.  The Indexing module received the list of ROI center
coordinates from the Focus of Attention Module. A 64 by 64 region from the FOA binary
mask and the original image was extracted for each of the coordinates. The binary mask
was used to determine the number of individual calcifications in each ROI. After processing
the 14 training images, the regions containing microcalcifications had at least three
individual calcifications present. This was assigned as the first indexing criterion.

For the ROIs containing at least three individual clusters, each ROI from the original
image was filtered with each of the 25 Laws masks. The Laws Energy Ratio, LER,
was  calculated  for  each  ROI/Laws  mask  combination.  This  ratio  determines  the  energy 
contained  in  the  individual  calcifications  versus  the  total  energy  in  the  ROI  filtered  by  the 
Laws  Mask.  To  determine  which  of  the  Laws  Energy  Ratios  had  the  strongest  response 
Figure 4.4  Free Response Operating Curves using Training Data for: (a) Laws Mask L5E5 (b) Laws Mask L5R5


to  the  microcalcifications,  a  FROC  analysis  was  done  for  each  of  the  25  LERs.  The  Laws 
Masks  L5E5  and  L5R5  had  100%  Probability  of  Detection  with  the  lowest  False  ROI  Rate 
for  the  14  training  images,  as  shown  in  Figures  4.4(a)  and  (b). 
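The Laws Energy Ratio described above can be sketched as follows. The five 1-D vectors are the standard Laws vectors of length 5, and the 25 masks are their outer products. Two points are assumptions made for illustration: energy is taken as the squared filter response, and the ROI borders are handled by reflection. The function names and the synthetic ROI are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Standard 1-D Laws vectors of length 5.
LAWS_VECTORS = {
    "L5": np.array([1, 4, 6, 4, 1], dtype=float),    # level
    "E5": np.array([-1, -2, 0, 2, 1], dtype=float),  # edge
    "S5": np.array([-1, 0, 2, 0, -1], dtype=float),  # spot
    "W5": np.array([-1, 2, 0, -2, 1], dtype=float),  # wave
    "R5": np.array([1, -4, 6, -4, 1], dtype=float),  # ripple
}

def laws_mask(name):
    """5x5 mask for a name such as 'L5E5' (outer product of two vectors)."""
    return np.outer(LAWS_VECTORS[name[:2]], LAWS_VECTORS[name[2:]])

def filter_roi(roi, mask):
    """Correlate the ROI with a 5x5 mask, reflecting at the borders."""
    padded = np.pad(roi.astype(float), 2, mode="reflect")
    patches = sliding_window_view(padded, mask.shape)
    return np.einsum("ijkl,kl->ij", patches, mask)

def laws_energy_ratio(roi, calc_mask, name):
    """Energy in the pixels flagged as calcifications divided by the
    total energy of the filtered ROI (energy assumed to be response^2)."""
    energy = filter_roi(roi, laws_mask(name)) ** 2
    total = energy.sum()
    return energy[calc_mask].sum() / total if total > 0 else 0.0

# Hypothetical 64x64 ROI with one bright 3x3 spot marked in the binary mask.
rng = np.random.default_rng(0)
roi = rng.normal(100.0, 2.0, size=(64, 64))
calc_mask = np.zeros((64, 64), dtype=bool)
calc_mask[30:33, 30:33] = True
roi[calc_mask] += 40.0
ler_l5e5 = laws_energy_ratio(roi, calc_mask, "L5E5")
```

Because the ratio is a fraction of the total filtered energy, it always lies between 0 and 1, which is consistent with the small threshold values reported for the L5E5 and L5R5 masks.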

The  Indexing  module  analysis  on  the  14  training  images  provided  a  first  attempt  at 
setting  the  proper  thresholds  for  the  LER  for  mask  L5E5  and  L5R5.  From  the  FROC 
analysis,  only  ROIs  with  L5E5  LER  of  greater  than  0.0346  and  an  L5R5  LER  of  greater 
than  0.0083  were  given  the  initial  hypothesis  of  being  a  region  of  microcalcifications.  For 
the  14  training  images,  this  resulted  in  a  100%  Probability  of  detection  and  an  average 
of 3.2 False ROIs per image. This is comparable to other researchers' results. Recall
the performance achieved by the system developed by Chan[27], which obtained an 85%
Probability of Detection with 2 false regions per image, and by Yoshida[42], with 83% Pd
and 5 false regions per image. It should be noted that how Chan and Yoshida divided
their data into training and testing sets is unknown. If an independent test set was not
held out, their results may be biased, as their systems could have been overtuned to their
training data.
Table 4.3  Fisher Ratio Values and Ranking for each ASM Feature

4.2.3  Feature  Extraction  and  Prediction.  The  full  set  of  25  Laws  Energy  Ratios, 
25  Angular  Second  Moment  and  6  Power  Spectrum  Analysis  features  were  extracted  for 
each ROI passed by the Indexing Module with the hypothesis of containing microcalcifications.
From these features, the Fisher Ratios were calculated to determine the top 6
features from each feature set, as shown in Tables 4.3, 4.4 and 4.5. These 6 features from each
feature set were used to train a neural network for a comparison study to determine which
of the texture measures gives the best performance.

4.2.4  Matching.  For  each  texture  feature  set,  5  networks  with  1  to  9  hidden  nodes 
(a  total  of  45  networks  for  each  feature  set)  were  trained  using  the  imbalanced  training  set 
Feature Set: Laws Energy Ratios

Laws Mask    F-Ratio    Rank
L5L5         0.3019     23
L5S5         0.7148     2
L5R5         0.5956     7
L5E5         0.6207     3
L5W5         0.6118     4
S5L5         0.5600     9
S5S5         0.7348     1
S5R5         0.4574     15
S5E5         0.5858     8
S5W5         0.5375     10
R5L5         0.4879     12
R5S5         0.4200     17
R5R5                    24
R5E5         0.3332     20
R5W5         0.3105     22
E5L5         0.6039     5
E5S5         0.6025     6
E5R5         0.3860     19
E5E5         0.1118     25
E5W5         0.4659     14
W5L5         0.4833     13
W5S5         0.4996     11
W5R5         0.3124     21
W5E5         0.4439     16
W5W5         0.4046     18

Table 4.4  Fisher Ratio Values and Ranking for each Laws Energy Ratio Feature


Feature Set: Power Spectrum Analysis

Ring Radius    F-Ratio    Rank
R01            0.0480     4
R12                       6
R24                       5
R48            0.2552     2
R816           0.2641     1
R1632          0.0650     3

Table 4.5  Fisher Ratio Values and Ranking for each Power Spectrum Analysis Feature
Feature Set    # Hidden Nodes    Probability of Detection    False ROI Rate
ASM            4                 0.94                        5.21
LER            4                 0.94                        2.36
PSA            7                 0.94                        1.64

Table 4.6  Training Data Set System Results including Matching Module


Parameter                  Value
Global Threshold           0.5%
Local Threshold            2.0
Number of Clusters/ROI     ≥3
L5R5 LER                   >0.0083
L5E5 LER                   >0.0346

Table 4.7  Parameter Settings Determined During System Development

modified  backpropagation  algorithm  with  a  fixed  learning  rate  of  0.1.  Each  network  was 
trained  until  at  least  90%  of  the  training  set  microcalcifications  were  correctly  identified. 
The  results  from  testing  on  the  training  data  are  shown  in  Table  4.6.  These  results  are 
biased  since  the  network  was  trained  with  the  same  data  it  was  tested  with,  naturally 
causing  a  high  Probability  of  Detection.  System  evaluation  with  the  Test,  Evaluation  and 
Normal  Data  sets  will  give  a  better  representation  of  neural  network  performance. 

4.3  System Evaluation: Test Data Set

The  Test  Data  was  next  processed  to  determine  the  effectiveness  of  the  parameters 
found  during  system  development,  as  shown  in  Table  4.7.  Analysis  of  results  from  the 
test  data  was  used  to  determine  if  the  system  parameters  were  over  tuned  for  the  Training 
Data  Set.  From  this  analysis,  the  parameters  were  “tweaked”  to  improve  generalization 
before  processing  the  final  Evaluation  Data. 

4.3.1  Focus of Attention Module.  Using the 17 Test Data images, the Focus of
Attention  module  was  able  to  detect  all  of  the  20  microcalcification  areas  with  an  average 
of  44.65  ROIs  per  image.  Table  4.8  breaks  down  the  results  for  each  image.  The  parameters 
for  the  global  and  local  thresholds  determined  during  training  performed  well  against  the 
Test  Set  by  identifying  100%  of  the  microcalcification  regions  in  the  17  Test  Set  images. 
Image    Number of Correct Regions    Rank          Total Number of Regions
AF092    1/1                          4             90
AF094    1/1                          13            74
AF102    1/1                          6             24
AF119    2/2                          1, 2          36
AF121    1/1                          1             28
AF128    3/3                          2, 3, 4       31
AF130    1/1                          12            28
AF141    1/1                          1             51
AF150    1/1                          2             61
AF160    1/1                          35            63
AF162    1/1                          30            53
AF168    1/1                          2             84
AF170    1/1                          5             43
AF186    1/1                          10            20
AF192    1/1                          14            18
AF202    1/1                          1             26
AF204    1/1                          4             29
Total    20/20                        7.6 (mean)    759

Table 4.8  Results of Focus of Attention Module using Testing Data


The  rankings  for  the  regions  were  more  spread  out,  ranging  from  1  to  35,  but  with  the 
majority  in  the  top  15.  Selecting  the  top  15  regions  would  result  in  a  90%  Probability  of 
Detection  with  an  average  5.12  False  Regions  per  Test  Data  Set  image. 

4.3.2  Indexing Module.  After the FOA identified the initial ROIs, the Indexing
module processed the ROIs using the parameters set during System Development. Using
these threshold values, 17 of the 20 true ROIs in the Test Data Set were correctly hypothesized
with an average false ROI rate of 4.9 ROIs per image. Analysis of the results
indicated one region was lost due to having fewer than three individual microcalcifications
identified in the ROI. The remaining two ROIs did not meet the Laws Energy Ratio thresholds.
A FROC analysis was done to determine if a new threshold value should be set.
Figure 4.5 shows the results of varying each parameter.

By  lowering  the  L5E5  LER  threshold  to  0.0287,  one  of  the  missed  ROIs  can  be 
detected,  increasing  the  Probability  of  detection  from  85%  to  90%  on  the  Test  Set.  The 
Figure 4.5  FROC Analysis of Test Data for (a) varying L5R5 LER and (b) varying L5E5 LER
(horizontal axes: Average Number of False ROIs per Image)

False ROI Rate for the Test Data Set increased from 4.9 to 6.47 ROIs per image using
the lower L5E5 LER threshold. To recover the second missed ROI, both the L5R5 and L5E5
thresholds had to be lowered, which caused an unacceptable number of false alarms to
pass through this stage. Only the L5E5 LER threshold was therefore changed, to 0.0287,
before processing the Evaluation and Normal Data sets. Checking the effect of this change
on the Training Data Set, the Pd remained at 100% while the False ROI Rate increased from 3.2 to 5.5
false ROIs per training set image.

4.3.3  Matching.  The texture features were extracted for the Test Data ROIs
passed  by  the  Indexing  Module.  These  features  were  evaluated  with  the  trained  networks 
from  the  system  evaluation  with  the  training  data.  Table  4.9  shows  the  performance  of 
each  feature  set  and  the  corresponding  number  of  hidden  nodes  in  the  neural  network. 
The  Angular  Second  Moment  Features  provided  little  false  ROI  reduction  while  lowering 
Probability  of  Detection.  The  Laws  Energy  Ratio  features  cut  the  false  ROI  rate  by  over 
a  factor  of  2,  while  having  the  same  Probability  of  Detection  as  the  ASM  features.  The 
Power  Spectrum  Analysis  features  had  a  slightly  lower  Probability  of  Detection,  but  had 
the  lowest  false  ROI  rate. 

The results from the LER and PSA feature sets were analyzed to determine which
regions  were  missed.  For  the  LER  feature  set,  the  microcalcification  regions  from  images 
Feature Set    Hidden Nodes    Probability of Detection    False ROI Rate
ASM            4               0.75                        6.35
LER            4               0.75                        2.59
PSA            7               0.70                        1.82

Table 4.9  Test Data Set System Results including Matching Module


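Feature Set                Feature    Saliency Value    Selected
Laws Energy Ratio          L5S5       0.4793            ✓
Laws Energy Ratio          L5E5       0.1759
Laws Energy Ratio          L5W5                         ✓
Laws Energy Ratio          S5S5
Laws Energy Ratio          E5L5       0.2011
Laws Energy Ratio          E5S5       0.9610            ✓
Power Spectrum Analysis    R01        0.2188
Power Spectrum Analysis    R12        0.6184            ✓
Power Spectrum Analysis    R24        0.5508            ✓
Power Spectrum Analysis    R48        0.1503
Power Spectrum Analysis    R816       0.9892            ✓
Power Spectrum Analysis    R1632      0.3991

Table 4.10  Ruck Saliency Values for LER and PSA Feature Sets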


AF130, AF150, and AF162 were incorrectly classified. For the PSA feature set, the microcalcification
regions from images AF130, AF160, AF170 and AF202 were misclassified.
Note that only one image, AF130, was missed by both feature sets. A combination of
features from the LER and PSA feature sets was selected using the Ruck Saliency Metric
to pick the top three features from each feature set. Table 4.10 gives the saliency values for
each feature and indicates which features were selected for use in combination. Using these
features in a neural network with 2 hidden nodes, the system Probability of Detection increased
to 80%, while the False ROI Rate rose to 3.88. With this combination, the regions
in images AF130, AF150 and AF202 were correctly classified.


4.4  System Evaluation: Evaluation and Normal Data Sets

The full system, with the criteria listed in Table 4.11, was run with the Angular Second
Moment, Laws Energy Ratio, Power Spectrum Analysis and LER/PSA combination feature
sets to evaluate system performance on the unseen Evaluation and Normal
Parameter                  Value
Global Threshold           0.5%
Local Threshold            2.0
Number of Clusters/ROI     ≥3
L5R5 LER                   >0.0083
L5E5 LER                   >0.0287
Hidden Nodes               4 (ASM), 4 (LER), 7 (PSA), 2 (LER/PSA)

Table 4.11  Final System Criteria Used for Evaluation/Normal Data Sets


              ASM Features    LER Features    PSA Features    LER/PSA Features
Data Set      Pd     FRR      Pd     FRR      Pd     FRR      Pd     FRR
Evaluation    0.75   6.25     0.75   4.58     0.81   3.67     0.75   5.75
Normal        -      4.6      -      3.2      -      1.7      -      3.3

Table 4.12  System Results on Evaluation and Normal Data Sets


Data  sets.  Table  4.12  lists  the  Probability  of  Detection  and  False  ROI  rates  for  these  data 
sets. 

The Probability of Detection was fairly constant across all the feature sets. This
suggests that the system should perform at approximately this level for any data set. The False
ROI Rate was slightly higher for the LER, PSA and combination feature sets. This may be
caused by the images that made up the Evaluation and Normal Data sets. These images
were digitized from slightly older films taken with a different X-ray system than those
used in the Training and Testing Data sets. The FOA module did correctly identify 100%
of the microcalcifications in the Evaluation Data set, but the hypothesis from the Indexing
Module was incorrect for 2 regions out of the 16 radiologist-identified microcalcification
clusters. These results indicate that the system FOA thresholds were not overtuned to the
Training and Test Data. The Matching Module incorrectly identified the remaining 1 or 2
regions for each feature set.

The  system  had  approximately  the  same  False  ROI  Rate  on  the  Normal  Data  set 
as  with  the  data  sets  containing  microcalcifications.  Analysis  of  the  results  showed  that 
the  majority  of  the  false  detections  were  from  images  AF263  and  AF273.  It  was  unknown 
Image Number    Number of Indexed Regions    Final Number of False ROIs Reported
                3                            2
229             5                            2.5
236             4                            1.5
244             1                            0.5
246             2                            0.5
247             3                            1
263             15                           9
273             15                           11
275             1                            0.5
286             6                            3.5
Total           55                           3.2

Table 4.13  Average Number of False Regions per Image Reported in Normal Data Set
for the Four Feature Sets

why  these  two  images  accounted  for  the  majority  of  false  detections.  Table  4.13  shows  the 
number  of  False  ROIs  passed  by  the  Indexing  stage  and  the  average  number  of  false  ROIs 
per  image  reported  by  the  system  for  the  four  different  feature  sets. 


4.5  Summary

The  Model  Based  Vision  Microcalcification  Detection  System  was  developed  and 
evaluated  using  53  images  with  a  total  of  54  microcalcification  regions.  Three  different 
texture  measure  features  were  examined  and  Fisher  Ratio  analysis  was  applied  to  select 
the  features  to  be  used  in  a  neural  network  classifier.  The  best  overall  performance  from 
the  individual  feature  sets  using  the  Training,  Testing,  Evaluation  and  Normal  data  sets 
was  achieved  with  the  Power  Spectrum  Analysis  features  resulting  in  an  83%  Probability 
of  Detection  with  a  False  ROI  Rate  of  2.17  regions  per  image.  This  is  a  comparable 
result  to  the  published  capabilities  of  approximately  83-85%  Pd  with  2-5  False  Regions 
per  image[13,  42].  The  Power  Spectrum  Analysis  features  performed  slightly  better  than 
the  Laws  Energy  Ratio  features  in  terms  of  False  ROI  Rate.  Both  had  the  same  Pd  of  83%. 
Table  4.14  breaks  out  the  performance  of  the  system  for  each  data  and  feature  set.  These 
two  feature  sets  gave  better  results  than  the  Angular  Second  Moment  features  which  have 
been  used  in  other  research]!?,  9].  By  creating  a  combination  feature  set  based  on  Ruck 
              ASM Features    LER Features    PSA Features    LER/PSA Features
Data Set      Pd     FRR      Pd     FRR      Pd     FRR      Pd     FRR
Training      0.94   5.21     0.94   2.36     0.94   1.64     1.00   3.07
Testing       0.75   6.35     0.75   2.59     0.70   1.82     0.80   3.88
Evaluation    0.75   6.25     0.75   4.58     0.81   3.67     0.75   5.75
Normal        -      4.6      -      3.2      -      1.7      -      3.3
Overall       0.81   5.6      0.83            0.83   2.17     0.85   4.0

Table 4.14  Overall System Results for Each Data and Feature Set


Saliency  of  the  LER  and  PSA  feature  sets,  an  overall  Pd  of  85%  with  a  False  ROI  Rate  of 
4.0  was  achieved. 
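The overall False ROI Rates quoted above appear consistent with an average of the per-data-set rates weighted by the number of images in each set. The following sketch checks that reading for the PSA and LER/PSA feature sets using the image counts from Table 4.1 and the rates reported in Tables 4.6 through 4.12; the weighting scheme itself is an assumption, and the variable names are hypothetical.

```python
# Number of images in each data set (Table 4.1).
images = {"Training": 14, "Testing": 17, "Evaluation": 12, "Normal": 10}

# False ROI Rates per image reported for each data set.
false_roi_rate = {
    "PSA": {"Training": 1.64, "Testing": 1.82, "Evaluation": 3.67, "Normal": 1.7},
    "LER/PSA": {"Training": 3.07, "Testing": 3.88, "Evaluation": 5.75, "Normal": 3.3},
}

def overall_false_roi_rate(rates):
    """Image-weighted average of the per-data-set False ROI Rates."""
    total_images = sum(images.values())
    return sum(rates[name] * images[name] for name in images) / total_images

psa_overall = overall_false_roi_rate(false_roi_rate["PSA"])
combo_overall = overall_false_roi_rate(false_roi_rate["LER/PSA"])
```

Under this weighting the PSA rate rounds to 2.17 and the combination rate rounds to 4.0 false ROIs per image over the 53 images.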
V.  Conclusions


5.1  Introduction

This  chapter  provides  a  summary  of  the  research  accomplished.  The  goal  of  this 
thesis was to develop a Model Based Vision system capable of identifying regions of microcalcifications
in a digitized mammogram. The system identifies regions which contain
microcalcifications,  but  does  not  classify  them  as  malignant  or  benign.  A  number  of  unique 
developments  in  the  area  of  feature  extraction  and  classification  were  presented. 

5.2  Summary  of  Methodology 

Following  a  Model  Based  Vision  paradigm  for  computerized  detection,  the  system 
was  composed  of  5  separate  modules.  The  first  module,  Focus  of  Attention,  used  a  three 
step  process  in  identifying  potential  regions  of  interest  (ROI).  The  digitized  mammogram 
was  first  subjected  to  a  non-linear  remapping  of  the  gray  levels  to  improve  the  contrast  and 
dynamic  range  of  the  microcalcifications.  This  image  was  then  filtered  with  a  Hit/Miss 
filtering  combination.  The  third  and  final  step  in  the  FOA  module  was  a  combination  of 
global  and  local  thresholding  to  remove  the  areas  not  corresponding  to  microcalcifications. 
The  FOA  module  was  modeled  after  work  performed  by  Chan[27].  Implementing  the 
Hit/Miss/Thresholding  technique  on  a  new  database  confirms  the  potential  of  this  method 
for segmenting microcalcifications. Augmenting Chan’s method with the non-linear preprocessing
allowed the thresholds to be set higher, reducing the number of false regions
segmented. The FOA module correctly segmented 100% of the microcalcification
regions while eliminating over 90% of the image from further processing.

The  ROIs  identified  by  the  FOA  module  were  assigned  an  initial  hypothesis  generated 
by  the  Indexing  Module.  This  hypothesis  was  a  function  of  the  number  of  individual 
calcifications  identified  in  the  ROI  and  a  novel  texture  energy  measure  called  the  Laws 
Energy  Ratio.  The  Laws  Energy  Ratio  compared  the  amount  of  energy  in  the  pixels 
identified  as  part  of  a  microcalcification  in  the  ROI  to  the  overall  energy  of  the  ROI  which 
has  been  filtered  with  the  Laws  masks  L5E5  and  L5R5.  The  Indexing  Module  correctly 


indexed  93%  of  the  microcalcification  regions  with  an  average  False  ROI  Rate  of  7.55 
regions  per  image  over  53  images. 

The ROIs assigned an initial hypothesis of being a region of microcalcifications had
a number of features extracted based on three different texture measures: Angular Second
Moment, Laws Energy Ratios and Power Spectrum Analysis. The Prediction Module used
Fisher Ratio analysis to determine the top 6 features from each feature set obtained by the
Feature Extraction Module. These features were then sent to the final Matching Module.
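
Feature ranking by Fisher Ratio can be sketched as below. The exact form of the ratio is not restated in this chapter, so the sketch assumes the common two-class form, the squared difference of the class means divided by the sum of the class variances. The function names and the toy feature values are hypothetical.

```python
from statistics import mean, pvariance

def fisher_ratio(class_a, class_b):
    """Two-class Fisher ratio of one feature (assumed form, see lead-in).

    Raises ZeroDivisionError if both classes have zero variance.
    """
    separation = (mean(class_a) - mean(class_b)) ** 2
    return separation / (pvariance(class_a) + pvariance(class_b))

def top_features(samples_a, samples_b, k):
    """Return the indices of the k features with the largest Fisher ratio."""
    n_features = len(samples_a[0])
    scores = [
        fisher_ratio([s[i] for s in samples_a], [s[i] for s in samples_b])
        for i in range(n_features)
    ]
    return sorted(range(n_features), key=lambda i: scores[i], reverse=True)[:k]

# Toy data: feature 0 separates the classes, feature 1 does not.
normal_rois = [[1, 1], [2, 5], [3, 9]]
calc_rois = [[7, 2], [8, 5], [9, 8]]
best = top_features(normal_rois, calc_rois, 1)
```

With k set to 6, the same ranking step would keep the six most discriminating features of a feature set.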

The Matching Module implemented a Multilayer Perceptron Neural Network trained
using a modified backpropagation algorithm to classify the ROIs as normal or
microcalcification tissue. A novel application of quantitatively selecting the best feature
subset for microcalcification identification was accomplished. Ruck Saliency metrics were
applied to identify the most relevant features in the LER and PSA feature sets to create a
combined feature set, resulting in an increased Probability of Detection.
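
The forward pass used for classification can be sketched as follows, assuming the structure of the Appendix B listing: features are normalized with the training mean and standard deviation, passed through one sigmoid hidden layer and one sigmoid output node, each with a bias input of 1 appended, and the output is compared to a decision threshold. The function names and the zero-valued weights are placeholders for illustration, not trained values.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mlp_forward(features, w1, w2):
    """One hidden layer, one output node; a bias input of 1 is appended."""
    inputs = list(features) + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, inputs))) for row in w1]
    return sigmoid(sum(w * v for w, v in zip(w2, hidden + [1.0])))

def classify_roi(features, train_mean, train_std, w1, w2, threshold):
    """Return 1 (microcalcification) or 0 (normal) for one ROI."""
    normalized = [(f - m) / s for f, m, s in zip(features, train_mean, train_std)]
    return 1 if mlp_forward(normalized, w1, w2) >= threshold else 0

# Placeholder network: 2 inputs, 1 hidden node, all weights zero.
w1 = [[0.0, 0.0, 0.0]]   # one row per hidden node: 2 inputs + bias
w2 = [0.0, 0.0]          # 1 hidden node + bias
output = mlp_forward([0.3, -1.2], w1, w2)
```

With all weights at zero the network output is 0.5 regardless of the input, so the decision depends only on where the threshold sits; in the listing each feature set has its own threshold (for example 0.2647 for the ASM network).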

5.3  Summary of Results

In  the  first  documented  comparative  study  of  texture  measures  for  microcalcification 
detection  on  a  single  database,  Power  Spectrum  Analysis  features  had  the  best  overall 
performance,  identifying  83%  of  the  microcalcification  regions  with  an  average  2.17  false 
regions  per  image.  These  results  were  verified  using  an  independent  Evaluation  Data  Set 
to  confirm  the  system  was  not  biased  to  the  Training  Data.  This  is  comparable  to  other 
research  which  has  obtained  an  85%  detection  rate  with  2  false  regions  per  image[27]  and 
83% with 5 false regions per image[42]. A combination of LER and PSA features based on
Ruck Saliency metrics was selected in an attempt to improve the classification accuracy.
The combination of features resulted in an overall correct classification rate of 85% with 4
false regions per image.
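
The two figures of merit quoted above can be computed as in this sketch. It assumes that the Probability of Detection is the fraction of radiologist-identified regions that were detected and that the False ROI Rate is averaged over every image, including normal images with no regions. The function name and the per-image counts are hypothetical illustrative values, not results from this research.

```python
def detection_metrics(per_image_counts):
    """per_image_counts: list of (true_regions, detected_regions, false_rois)."""
    true_total = sum(t for t, _, _ in per_image_counts)
    detected_total = sum(d for _, d, _ in per_image_counts)
    false_total = sum(f for _, _, f in per_image_counts)
    probability_of_detection = detected_total / true_total
    false_roi_rate = false_total / len(per_image_counts)
    return probability_of_detection, false_roi_rate

# Illustrative counts for four images; the third is a normal image.
counts = [(2, 2, 3), (1, 0, 1), (0, 0, 4), (3, 3, 0)]
pd, false_rate = detection_metrics(counts)
```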

Although  the  system  was  not  designed  to  classify  the  microcalcification  regions  as 
malignant  or  benign,  it  is  interesting  to  note  that  89%  of  the  malignant  microcalcification 
regions were correctly identified using the combination of PSA and LER features. This
suggests that the system is more sensitive to the cancerous regions. A logical extension to this
research would be to add a stage that classifies the identified microcalcification
regions as malignant or benign.

5.4  Conclusion 

This  research  explored  the  application  of  Model  Based  Vision  to  the  detection  of 
microcalcifications.  A  number  of  novel  techniques  were  explored  for  this  research: 

•  The Hit/Miss filtering technique was effective in increasing the signal-to-noise
ratio of the microcalcifications sufficiently that a global and local thresholding
combination could accurately segment those regions. Preprocessing the image
improved performance. Frequency analysis of the Hit/Miss filtering technique showed
results consistent with other research in wavelet-based detection[21, 42].

•  A novel texture feature, the Laws Energy Ratio, was effective in separating
normal and abnormal tissue regions in the Indexing Module, correctly indexing 93% of
the microcalcification regions. Using the new features for classifying the region as
normal or microcalcification tissue yielded competitive results of 83% Probability of
Detection with an average 3.09 False ROIs per image on 53 images.

•  In  the  first  documented,  direct  comparative  study  of  three  different  texture  measures 
for  the  classification  of  normal  and  microcalcification  tissue,  the  Power  Spectrum 
Analysis  feature  set  had  the  best  overall  performance  with  an  83%  Probability  of 
Detection  with  an  average  2.17  False  ROIs  per  image. 

•  A neural network, trained with a modified backpropagation algorithm using a
combination feature set derived from a quantitative feature selection method, was able
to increase the Probability of Detection, correctly identifying 85% of the
radiologist-identified microcalcifications with an average of 4 False ROIs per image.

This research successfully met the objective of developing a complete, end-to-end
Microcalcification Detection System as stated in Chapter I. The system was developed
and evaluated using independent data sets. The final performance of the system should
be a reasonable indication of system performance on any novel data set.

Appendix A.  Database Information

The  following  tables  list  the  images  used  for  each  data  set.  The  locations  given  are 
the  center  [row, column]  locations  of  the  microcalcification  regions  for  a  2048  by  1024  image. 
IMAGE   DIAGNOSIS   REGIONS   LOCATIONS
AF005   Malignant   2         [976,826], [504,665]
AF006   Malignant   3         [1194,319], [1165,363], [956,208]
AF007   Malignant   2         [603,533], [477,533]
AF008   Malignant   1         [943,416]
AF009   Malignant   1         [1410,453]
AF020   Malignant   1         [709,199]
AF022   Benign      1         [734,524]
AF024   Benign      1         [1082,654]
AF033   Malignant   1         [462,717]
AF038   Benign      1         [1154,345]
AF040   Benign      1         [1298,317]
AF045   Benign      1         [841,344]
AF047   Benign      1         [1548,607]
AF055   Malignant   1         [1313,824]

Table A.1  Training Data Set Information


IMAGE  DIAGNOSIS 


AF092  Benign 


REGIONS 


AF092 

Benign 

AF102 

Malignant 

AF119 

Benign 

AF130 


AF141 


AF150 


AF160 


AF162 


AF168 


AF170 


AF186 


AF 


AF2 


AF204 


Benign 


Benign 


Benign 


Malignant 


Benign 


Benign 


Be 


Benign 


Benign 


AF: 

Benign 

AF128 

Malignant 

LOCATIONS 


[1274,747] 


[1514,576] 


[579,410] 


[865, 656],  [763,675] 


[895,448] 


[662,474] ,  [698,457] ,  [758,482] 


[1097,372] 


[592,588] 


[322,71] 


[960,789] 


[1263,690] 


[925,257] 


[851,611] 


[1379,117] 


[1033,761] 


[1124,282] 


[1209,318]


Table A.2  Testing Data Set Information
IMAGE   DIAGNOSIS   REGIONS   LOCATIONS
AF224   Malignant   2         [1453,303], [1391,339]
AF226   Malignant   1         [1325,586]
AF240   Malignant   1         [1107,90]
AF241   Malignant   2         [236,936], [364,1064]
AF259   Benign      1         [1410,453]
AF261   Benign      1         [1650,840]
AF264   Benign      1         [1621,239]
AF266   Benign      1         [1240,167]
AF267   Malignant   2         [778,552], [726,574]
AF269   Malignant   1         [1356,290]
AF282   Benign      1         [707,159]
AF284   Benign      2         [1156,114], [1184,162]

Table A.3  Evaluation Data Set Information


IMAGE   DIAGNOSIS   REGIONS   LOCATIONS
AF214   Normal      -         -
AF229   Normal      -         -
AF236   Normal      -         -
AF244   Normal      -         -
AF246   Normal      -         -
AF247   Normal      -         -
        Normal      -         -
        Normal      -         -
AF275   Normal      -         -
AF286   Normal      -         -

Table A.4  Normal Data Set Information
Appendix B.  Computer Code

The following sections contain the computer code used during this research. Coding
was accomplished using multiple image processing environments, including MATLAB, and by
programming directly in C.

B.1  MATLAB Code

The following M-files were used in the MATLAB environment. All main M-files and
any function calls are included for completeness. Each M-file is separated by two rows
of % symbols.


%This program will take the input image file name, use the defined
%parameters and perform the
%hit/miss filtering operation and the local thresholding operation.
%The surviving rois are tested for number and size of possible
%microcalcifications.  A binary mask and the x,y coordinates will
%be returned to the main program.
%
%
%FUNCTIONS CALLED DURING micro_det_sys.m:
%
%local_thres: C-program for local thresholding
%  histo:     MEX file for finding a histogram of a gray level image
%  main_seg:  M-file for finding minimum number of rois
%  raw2viff:  Khoros routine to convert file to viff type file
%  vpebble:   Khoros routine to find non-connected pixel groups
%             and remove groups larger than or smaller than a
%             specified number
%  cluster:   M-file to find number of non-connected pixel groups
%  find_asm:  M-file to extract angular second moment features
%  find_ring: M-file to extract power spectrum features
%
%
function [asm_good,asm_bad,ler_good,ler_bad,psa_good,psa_bad,combo_good, ...
          combo_bad,keep,toss]=micro_det_sys(file);

%define parameters
ws=64;
gthres=0.5;
lthres=2.0;
lws=51;

min_num_clusters=3;
min_LR_LER=0.00829;
min_LE_LER=0.0287;


%load mamo
mamopath='/home/pinnal/bdata/wpafbh/';
filename=[mamopath file];

if str2num(file(3:5))<=204
  %Training and Testing Data Sets open using 1024 by 2048 size
  fid=fopen(filename,'r');
  X=fread(fid,[2048 1024],'ushort');
  fclose(fid);

  %remove tags from selected images
  if sum(file=='af007')==5
    X(1:600,1:150)=zeros(size(X(1:600,1:150)));
  elseif sum(file=='af005')==5
    X(1:200,1:400)=zeros(size(X(1:200,1:400)));
  elseif sum(file=='af006')==5
    X(1:100,480:1024)=zeros(size(X(1:100,480:1024)));
  elseif sum(file=='af008')==5
    X(100:600,800:1024)=zeros(size(X(100:600,800:1024)));
  elseif sum(file=='af020')==5
    X(1:400,750:1024)=zeros(size(X(1:400,750:1024)));
  elseif sum(file=='af022')==5
    X(1:200,500:1024)=zeros(size(X(1:200,500:1024)));
  elseif sum(file=='af024')==5
    X(1:400,800:1024)=zeros(size(X(1:400,800:1024)));
  elseif sum(file=='af038')==5
    X(50:300,750:1024)=zeros(size(X(50:300,750:1024)));
  elseif sum(file=='af092')==5
    X(1:150,700:1024)=zeros(size(X(1:150,700:1024)));
  elseif sum(file=='af121')==5
    X(100:225,300:425)=zeros(size(X(100:225,300:425)));
  else
    X=X;
  end;
else
  %Evaluation and Normal Data Sets open using 1124 by 2048 size
  fid=fopen(filename,'r');
  X=fread(fid,[2048 1124],'ushort');
  fclose(fid);

  %remove tags from selected images and crop images
  if sum(file=='af224')==5
    X(100:400,700:1124)=zeros(size(X(100:400,700:1124)));
    X=X(:,1:1024);
  elseif sum(file=='af240')==5
    X(1:100,500:1124)=zeros(size(X(1:100,500:1124)));
    X=X(:,1:1024);
  elseif sum(file=='af259')==5
    X(1:200,1:700)=zeros(size(X(1:200,1:700)));
    X=X(:,101:1124);
  elseif sum(file=='af284')==5
    X(200:600,900:1124)=zeros(size(X(200:600,900:1124)));
    X=X(:,1:1024);
  elseif (sum(file=='af226')==5 | sum(file=='af241')==5 | ...
          sum(file=='af261')==5 | sum(file=='af267')==5 | ...
          sum(file=='af269')==5)
    X=X(:,101:1124);
  elseif sum(file=='af214')==5
    X(1:250,550:1000)=zeros(size(X(1:250,550:1000)));
    X=X(:,1:1024);
  elseif sum(file=='af273')==5
    X(1:100,1:600)=zeros(size(X(1:100,1:600)));
    X=X(:,101:1124);
  elseif sum(file=='af286')==5
    X(1:200,800:1124)=zeros(size(X(1:200,800:1124)));
    X=X(:,1:1024);
  elseif (sum(file=='af247')==5 | sum(file=='af263')==5 | ...
          sum(file=='af275')==5)
    X=X(:,101:1124);
  else
    X=X(:,1:1024);
  end;
end;

%write out for local thresholding USE ONLY THE ORIGINAL IMAGE
fid=fopen('local_thres_img','wb');
fwrite(fid,X,'ushort');
fclose(fid);

param=[lthres lws];

fid=fopen('local_param','wb');
fwrite(fid,param,'float');
fclose(fid);


%%FOCUS OF ATTENTION

%Call local thresholding program
!local_thres;

%perform the sigmoid adjustment
B = .003;
x0 = 3100;

Y = 4000./(1 + exp(-B*(X - x0)));
Y = .05*X + Y;
Y=round(Y);

clear X B x0;
%perform hit/miss filtering/thresholding
load hmtfilter;

hm=conv2(Y,hmtfilter,'same');
clear Y hmtfilter param lthres lws sig;

offset=min(min(hm));
hm=hm-offset;
clear offset;

%%find top pixels to keep %%
high = max(max(hm));
low = min(min(hm));

[num_pix,gl] = histo(hm,high,low,1);

total=sum(num_pix);
limit=total*(1-gthres/100);
sum_pix=0;

for gt_level=1:4096;
  sum_pix=sum_pix + num_pix(gt_level);
  if sum_pix>=limit;
    break;
  end;
end;

hmtmask=hm>=gt_level;
clear gt_level sum_pix num_pix total limit high low gl hm;

%load local thresholding mask
fid=fopen('local_mask','r');
ltmask=fread(fid,[2048 1024],'float');
fclose(fid);

%logically AND the hmtmask and ltmask
IMG=hmtmask&ltmask;
clear ltmask;

%write out hmtmask for pixel reduction by Khoros
fid=fopen('hmtmask','wb');
fwrite(fid,hmtmask,'uint1');
fclose(fid);
clear hmtmask;

!raw2viff -i hmtmask -o vfile1 -r 1024 -c 2048 -t bit
!vpebble -i vfile1 -o hmtmaskr -val 1 -min 4 -max 45

IMG=IMG';
roi_size=ws;

main_seg;
clear IMG;

%read in reduced mask with clusters >3 pixels and <45 pixels
fid=fopen('hmtmaskr','r');
head=fread(fid,1024,'char');
IMG=fread(fid,[2048 1024],'uint1');
fclose(fid);

%% create the 25 laws matrices
L5=[1 4 6 4 1];      %% local average
S5=[-1 0 2 0 -1];    %% spot detector
R5=[1 -4 6 -4 1];    %% ripple detector
E5=[-1 -2 0 2 1];    %% edge detector
W5=[-1 2 0 -2 1];    %% wave detector

L5L5=L5'*L5;
L5S5=L5'*S5;
L5R5=L5'*R5;
L5E5=L5'*E5;
L5W5=L5'*W5;

S5L5=S5'*L5;
S5S5=S5'*S5;
S5R5=S5'*R5;
S5E5=S5'*E5;
S5W5=S5'*W5;

R5L5=R5'*L5;
R5S5=R5'*S5;
R5R5=R5'*R5;
R5E5=R5'*E5;
R5W5=R5'*W5;

E5L5=E5'*L5;
E5S5=E5'*S5;
E5R5=E5'*R5;
E5E5=E5'*E5;
E5W5=E5'*W5;

W5L5=W5'*L5;
W5S5=W5'*S5;
W5R5=W5'*R5;
W5E5=W5'*E5;
W5W5=W5'*W5;

mask=[
'L5L5'
'L5S5'
'L5R5'
'L5E5'
'L5W5'
'S5L5'
'S5S5'
'S5R5'
'S5E5'
'S5W5'
'R5L5'
'R5S5'
'R5R5'
'R5E5'
'R5W5'
'E5L5'
'E5S5'
'E5R5'
'E5E5'
'E5W5'
'W5L5'
'W5S5'
'W5R5'
'W5E5'
'W5W5'
];

index_mask=[
'L5R5'
'L5E5'
];


%list of FOA roi center coordinates
xc=out2(:,2);
yc=out2(:,1);

xct=xc-ws/2;
xcb=xc+ws/2;
ycl=yc-ws/2;
ycr=yc+ws/2;

%open original image again
fid=fopen(filename,'r');
X=fread(fid,[2048 1024],'ushort');
fclose(fid);

%index rois and get features for surviving rois
num_rois=out1;

%start checking each roi for indexing,
%  feature extraction and matching
for i=1:num_rois;

  %check roi for extraction
  if xc(i)>32 | yc(i)>32;
    oroi=X(xct(i):xcb(i),ycl(i):ycr(i));    %original image roi
    mroi=IMG(xct(i):xcb(i),ycl(i):ycr(i));  %mask image roi
  else
    break;  %roi center too close to edge of image
  end;

  if sum(sum(mroi))==0;
    break;  %do not process rois without a cluster
  end;
  %%INDEXING%%

  %get cluster information
  [num_cls,EN,D,cnts,csize]=cluster(mroi);

  %get laws info for indexing
  %make rois 64x64 for laws and fft processing
  orois=oroi(1:64,1:64);
  mrois=mroi(1:64,1:64);

  %using original image
  for j=1:size(index_mask,1);
    eval(['x=conv2(orois,' index_mask(j,:) ',''valid'');']);
    x=x.*(x>=0);
    total=sum(sum(x));
    region=sum(sum(x.*mrois(3:62,3:62)));
    index_laws(j)=region/total;
  end;

  if (num_cls>=min_num_clusters & index_laws(1)>=min_LR_LER ...
      & index_laws(2)>=min_LE_LER);

    %%% possible microcalcification roi %%%%%%%

    %%%FEATURE EXTRACTION%%%%

    %get laws ratios
    %using original image
    for j=1:size(mask,1);
      eval(['x=conv2(orois,' mask(j,:) ',''valid'');']);
      x=x.*(x>=0);
      total=sum(sum(x));
      region=sum(sum(x.*mrois(3:62,3:62)));
      ler_feature(j)=region/total;
    end;
    %get asm features for [0,0] to [4,4]
    asm_feature=findasm(oroi,4);

    %get psa features
    psa_feature=findring(orois);

    %single feature vector containing all features
    features=[str2num(file(3:5)) xc(i) yc(i) num_cls ...
              ler_feature asm_feature psa_feature];

    %running total of all features for indexed rois
    keep=[keep; str2num(file(3:5)) xc(i) yc(i) num_cls ...
          ler_feature asm_feature psa_feature];

    %%%MATCHING WITH NEURAL NETWORK%%%%

    %%%USING ASM FEATURES%%%
    ASM=[1 features(:,30) features(:,36) features(:,42) ...
         features(:,48) features(:,51) features(:,54)];
    data=ASM;

    load asmweights  %% 4 middle nodes
    train_data=nn_data_train;
    W1=w1_4;
    W2=w2_4;

    ave=mean(train_data(:,2:I+1));
    dev=std(train_data(:,2:I+1));

    average=ones(n,1) * ave;
    sigma=ones(n,1) * dev;

    data(:,2:I+1)=(data(:,2:I+1)-average)./sigma;
    data=data';

    z1 = 1 ./ (1 + exp(-W1 * [data(2:I+1,1);1]));
    z2 = 1 ./ (1 + exp(-W2 * [z1; 1]));

    if z2>=0.2647
      asmguess = 1;
    else
      asmguess = 0;
    end;

    if asmguess==1
      asm_good=[asm_good;features(2:3)];
    else
      asm_bad=[asm_bad;features(2:3)];
    end;

    %%%USING LER FEATURES%%%
    LER=[1 features(:,6) features(:,8:9) ...
         features(:,11) features(:,20:21)];
    data=LER;

    load lerweights  %% 4 middle nodes
    train_data=nn_data_train;
    W1=w1_4;
    W2=w2_4;

    ave=mean(train_data(:,2:I+1));
    dev=std(train_data(:,2:I+1));

    average=ones(n,1) * ave;
    sigma=ones(n,1) * dev;

    data(:,2:I+1)=(data(:,2:I+1)-average)./sigma;
    data=data';

    z1 = 1 ./ (1 + exp(-W1 * [data(2:I+1,1);1]));
    z2 = 1 ./ (1 + exp(-W2 * [z1; 1]));

    if z2>=0.1741
      lerguess = 1;
    else
      lerguess = 0;
    end;

    if lerguess==1
      ler_good=[ler_good;features(2:3)];
    else
      ler_bad=[ler_bad;features(2:3)];
    end;


    %%%USING PSA FEATURES%%%
    PSA=[1 features(:,55:60)];
    data=PSA;

    load psaweights  %% 7 middle nodes
    W1=w1_7;
    W2=w2_7;
    train_data=nn_data_train;

    ave=mean(train_data(:,2:I+1));
    dev=std(train_data(:,2:I+1));

    average=ones(n,1) * ave;
    sigma=ones(n,1) * dev;

    data(:,2:I+1)=(data(:,2:I+1)-average)./sigma;
    data=data';

    z1 = 1 ./ (1 + exp(-W1 * [data(2:I+1,1);1]));
    z2 = 1 ./ (1 + exp(-W2 * [z1; 1]));

    if z2>=0.4071
      psaguess = 1;
    else
      psaguess = 0;
    end;

    if psaguess==1
      psa_good=[psa_good;features(2:3)];
    else
      psa_bad=[psa_bad;features(2:3)];
    end;


    %%%USING LER/PSA FEATURES%%%
    combo=[1 features(:,6) features(:,9) features(:,21) ...
           features(:,56:57) features(:,59)];
    data=combo;

    load comboweights  %% 2 middle nodes
    W1=w1_2;
    W2=w2_2;
    train_data=nn_data_train;

    ave=mean(train_data(:,2:I+1));
    dev=std(train_data(:,2:I+1));

    average=ones(n,1) * ave;
    sigma=ones(n,1) * dev;

    data(:,2:I+1)=(data(:,2:I+1)-average)./sigma;
    data=data';

    z1 = 1 ./ (1 + exp(-W1 * [data(2:I+1,1);1]));
    z2 = 1 ./ (1 + exp(-W2 * [z1; 1]));

    if z2>=0.2156
      comboguess = 1;
    else
      comboguess = 0;
    end;

    if comboguess==1
      combo_good=[combo_good;features(2:3)];
    else
      combo_bad=[combo_bad;features(2:3)];
    end;

    %%%

  else

    %%%NOT INDEXED MICROCALCIFICATION REGION%%%

    %%keep x,y coord and indexing features for error analysis
    toss=[toss; str2num(file(3:5)) xc(i) yc(i) ...
          num_cls index_laws(1) index_laws(2)];

  end;

end;

%program complete


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%
% main_seg.m
%
%%%%%%%%%%%%%
% Program main_seg.m that executes the segmentation functions for a
% 1024 by 2048 hit and miss thresholded mammogram.
% Original by Dru McCandles; Modified by Ron Dauk

% The requirements to run this program are:
%
%   1: A 1024x2048 matrix called IMG exists in memory
% The program parameters are:
%
%   top_margin   uncertainty edge distance from top/bottom of IMG
%   side_margin  uncertainty edge distance from sides of IMG
%   min_energy   minimum "energy" required for ROI to be
%                considered relevant after first pass
%   thresh       minimum energy to survive the second pass
%   box_row      # rows in the sliding window (size in rows)
%   box_col      # cols in the sliding window (size in cols)
%   -NOTE: the (image size - margin) / box size
%          must be an integer !!!

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Initial Threshold
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%mn = mean(IMG(:));
%sd = std(IMG(:));
hi_t = 1;  % normally 7*sd;
%IMG(1:20,:) = zeros(20,2028);
%MASK = IMG > hi_t;
%IMG = IMG.*MASK;
%clear MASK
%figure(1)
%image(IMG)

% Parameter Definitions
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

top_margin = 0;
side_margin = 0;
%min_energy = 600;  % usually = 600
%thresh = 1400;     % usually = 1400
box_row  =  roi_size; 
box_col = roi_size;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% BEGIN PROGRAM
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% Compute the "Energy" matrix E
E = slider(IMG,top_margin,side_margin,box_row,box_col);

% Keep only those regions which have at least the minimum energy
min_energy = 1;  % normally .7*mean(E(:));

[I,J] = find(E > min_energy);
I_mid = (I-1)*box_row+top_margin+(box_row/2);
J_mid = (J-1)*box_col+side_margin+(box_col/2);

% Perform the centroid migration
[G,EN]=SEG(IMG,I,J,top_margin,side_margin,min_energy,box_row,box_col);

thresh = 1;  % normally 4*min_energy;

[I_final,J_final,E_final] = reducer(G,EN,thresh);

for i = 1:length(I_final)
  if (I_final(i)<(box_row/2) | I_final(i)>(1020 - (box_row/2)))
    E_final(i) = 0;
  elseif (J_final(i)<(box_col/2) | J_final(i)>(2028 - (box_col/2)))
    E_final(i) = 0;
  end
end

F = find(E_final);
for i = 1:length(F);
  I_clear(i) = I_final(F(i));
  J_clear(i) = J_final(F(i));
  E_clear(i) = E_final(F(i));
end

[m,out1]=size(E_clear);

[rank,index]=sort(E_clear');
rank=flipud(rank);
index=flipud(index);

out2=zeros(out1,2);
for i=1:out1;
  out2(i,1)=I_clear(index(i));
  out2(i,2)=J_clear(index(i));
end;

clear I_clear J_clear E_clear index rank I_final J_final
clear E_final top_margin side_margin box_row box_col hi_t F I
clear J I_mid J_mid E EN G min_energy


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function E = slider(IMG,top_margin,side_margin,nrow,ncol);
% function E = slider(IMG,top_margin,side_margin,nrow,ncol);
%
% function that returns the matrix E of the sum of the abs of the pixel
% values in IMG, where IMG is a 1020x2028 reconstructed wavelet image
% of a mammogram.  Each entry in E is the 'energy' of
% a nrow by ncol size piece of IMG, with a 1-to-1 correspondence
% between the location of E(i,j) and
% the location of the 99x100 piece of IMG for which it was computed.
%
% To determine where E(i,j) came from, find:
%   row_start = (i-1)*nrow + top_margin + 1
%   col_start = (j-1)*ncol + side_margin + 1
%
% The roi is located at
%   (row_start : row_start+nrow-1, col_start : col_start+ncol-1)
%
% The energy is computed by sliding a non-overlapping
% nrow x ncol box over IMG

[nr,nc] = size(IMG);  % This should be 1020 x 2028 !!
rboxes = (nr - 2*top_margin)/nrow;
cboxes = (nc - 2*side_margin)/ncol;

for x = 1:cboxes
  for y = 1:rboxes
    row_index = top_margin + ((y-1)*nrow) + 1;
    col_index = side_margin + ((x-1)*ncol) + 1;
    ROI = IMG(row_index:(row_index+nrow-1),col_index:(col_index+ncol-1));
    E(y,x) = sum(sum(abs(ROI)));
  end
end


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function [G,EN]=SEG(IMG,I,J,top_margin,side_margin,min_energy,srow,scol);
%function [G,EN]=SEG(IMG,I,J,top_margin,side_margin,min_energy,srow,scol);

tol = 3;

[Sr,Sc] = size(IMG);
L = length(I);

for i = 1:L
  i;
  row_index = top_margin + ((I(i)-1)*srow) + 1;
  col_index = side_margin + ((J(i)-1)*scol) + 1;

  ROI=IMG(row_index:(row_index+srow-1),col_index:(col_index+scol-1));
  C = centroid(abs(ROI));

  % recompute the new ROI
  nri = C(1) + row_index - (srow/2) + 1;
  if nri < (top_margin + 1)
    nri = top_margin + 1;
  end

  nrif = C(1) + row_index + (srow/2);
  if nrif > (Sr - top_margin)
    nrif = Sr - top_margin;
  end

  nci = C(2) + col_index - (scol/2) + 1;
  if nci < (side_margin + 1)
    nci = side_margin + 1;
  end

  ncif = C(2) + col_index + (scol/2);
  if ncif > (Sc - side_margin)
    ncif = (Sc - side_margin);
  end

  ROI = IMG(nri:nrif,nci:ncif);

  OCX = [C(1)+row_index C(2)+col_index];
  row_index = nri;
  col_index = nci;

  C = centroid(abs(ROI));
  NCX = [C(1)+nri C(2)+nci];
  d = sqrt((OCX - NCX)*(OCX - NCX)');
  EN(i) = sum(sum(abs(ROI)));
  n = 1;

  while d > tol
    nri = C(1) + row_index - (srow/2) + 1;
    if nri < (top_margin + 1)
      nri = top_margin + 1;
    end

    nrif = C(1) + row_index + (srow/2);
    if nrif > (Sr - top_margin)
      nrif = Sr - top_margin;
    end

    nci = C(2) + col_index - (scol/2) + 1;
    if nci < (side_margin + 1)
      nci = side_margin + 1;
    end

    ncif = C(2) + col_index + (scol/2);
    if ncif > (Sc - side_margin)
      ncif = (Sc - side_margin);
    end

    ROI = IMG(nri:nrif,nci:ncif);

    EN(i) = sum(sum(abs(ROI)));
    if EN(i) < min_energy
      d = 0;
    end

    OCX = [C(1)+row_index C(2)+col_index];
    row_index = nri;
    col_index = nci;

    C = centroid(abs(ROI));
    NCX = [C(1)+nri C(2)+nci];
    d = sqrt((OCX - NCX)*(OCX - NCX)');
    n = n + 1;
  end

  new_I(i) = C(1) + nri;
  new_J(i) = C(2) + nci;
end

G = [new_I' new_J'];


0  /o  h  /e  /o  /e  /e  /e  /e  /o  /o  /o  /o  /o  /ft  /ft  /ft  /ft  /ft  /ft  /ft  /ft  /•  /«  /ft  /ft  /ft  /ft  /ft  /ft  /ft  /o  /o  /«  /«  /•  /#  /•  /•  /ft  /o  /«  /ft  /o  /•  /ft  /ft  /»  /»  /•  /e  /e  /ft  /ft  /ft  /ft  /•  /ft  /ft  /ft  /ft  /ft  /ft  /ft  /ft  /ft  /ft  /ft  /o  /o 


function [I_final, J_final, E_final] = reducer(G, EN, thresh)
% program reducer.m that removes duplicate rois
% (i.e., rois that have centers that are within 20 pixels of each other)
% - it keeps the roi with the highest energy.
% NOTE: the distance test in the code below uses 30 pixels, not 20.

% The row,col components are in a L by 2 matrix G, and the Energy is in
% a L by 1 vector EN

L = length(EN);
wun = ones(L,1);

% build the L by L matrix of center-to-center distances, one column per roi
DIST = [];               % initialized so the concatenation below is defined
for i = 1:L
    tmp = wun*G(i,:);
    A = G - tmp;
    D = sqrt(diag(A*A'));
    DIST = [DIST D];
    DIST(i,i) = 1000;    % large value so an roi never matches itself
end

% for every pair of close rois, zero the energy of the weaker one
[II, JJ] = find(DIST < 30);
for i = 1:length(II);
    if EN(II(i)) > EN(JJ(i))
        EN(JJ(i)) = 0;
    else
        EN(II(i)) = 0;
    end
end

for i = 1:L
    if EN(i) > 99999500    % usually 9500
        EN(i) = 0;
    end
end

% keep only the rois whose energy is still above thresh
I_final = []; J_final = []; E_final = [];   % defined even if nothing survives
F = find(EN > thresh);
for i = 1:length(F);
    I_final(i) = G(F(i),1);
    J_final(i) = G(F(i),2);
    E_final(i) = EN(F(i));
end
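
The duplicate-removal idea in reducer.m can also be sketched in C. This is a minimal illustration, not a translation of the listing: the `Roi` struct, the function name `reduce_rois`, and the `min_dist` parameter are assumptions introduced here, and each pair of ROIs is visited once rather than twice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one region of interest (names are illustrative). */
typedef struct {
    double row, col;   /* ROI center */
    double energy;     /* absolute energy of the ROI */
} Roi;

/* For every pair of ROIs whose centers are closer than min_dist, zero the
 * energy of the weaker one. Then move the ROIs whose energy is still above
 * thresh to the front of the array. Returns the number of ROIs kept. */
size_t reduce_rois(Roi *rois, size_t n, double min_dist, double thresh)
{
    size_t i, j, kept = 0;

    assert(rois != NULL || n == 0);
    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {
            double dr = rois[i].row - rois[j].row;
            double dc = rois[i].col - rois[j].col;
            /* compare squared distances to avoid a square root */
            if (dr * dr + dc * dc < min_dist * min_dist) {
                if (rois[i].energy > rois[j].energy)
                    rois[j].energy = 0.0;
                else
                    rois[i].energy = 0.0;
            }
        }
    }
    for (i = 0; i < n; i++)
        if (rois[i].energy > thresh)
            rois[kept++] = rois[i];
    return kept;
}
```

As in the MATLAB version, the outcome for chains of mutually close ROIs depends on the order in which pairs are compared.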


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


function C = centroid(ROI);
% function C = centroid(ROI);
%
% This function computes a weighted centroid C = [rc cc] of the matrix ROI
%

[I,J,V] = find(ROI);    % row, column and value of every nonzero element
S = sum(V);
rc = sum(I.*V)/S;
cc = sum(J.*V)/S;
C = [round(rc) round(cc)];
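
For comparison, the same weighted centroid can be written in C. This is a sketch under stated assumptions: the function name `weighted_centroid` and the row-major `double` image layout are choices made here, the weights are assumed nonnegative (the listings pass `abs(ROI)`), and the result is reported 1-based to match MATLAB indexing.

```c
#include <assert.h>
#include <stddef.h>

/* Weighted centroid of a rows-by-cols image stored row-major.
 * Writes the 1-based (row, col) position, rounded to the nearest pixel,
 * to *rc and *cc. Assumes nonnegative weights.
 * Returns 0 on success, -1 if the weights sum to zero. */
int weighted_centroid(const double *img, size_t rows, size_t cols,
                      long *rc, long *cc)
{
    double s = 0.0, rsum = 0.0, csum = 0.0;
    size_t r, c;

    assert(img != NULL && rc != NULL && cc != NULL);
    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            double v = img[r * cols + c];
            s += v;
            rsum += (double)(r + 1) * v;
            csum += (double)(c + 1) * v;
        }
    }
    if (s == 0.0)
        return -1;
    /* adding 0.5 and truncating rounds to nearest for nonnegative values */
    *rc = (long)(rsum / s + 0.5);
    *cc = (long)(csum / s + 0.5);
    return 0;
}
```

The explicit zero-sum check covers the case in which the MATLAB function would divide by zero for an all-zero ROI.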


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


function [num,E,D,M,csize] = cluster(IMG);
% function [num,E,D,M,csize] = cluster(IMG);
%
% This function takes in an image IMG and determines the number of unique
% clusters (num), the abs energy of each cluster (E), the distance of the
% center of each cluster from the centroid (D), the center coordinate
% of each cluster (M), and the size in pixels of each cluster (csize).
%
% The function works using a two-pass loop: The first pass groups all
% pixels that are left-right of each other together first, and then
% top-bottom second by assigning each pixel a cluster number C(i).
% The second pass then groups all of the 'sub-clusters' together that
% are top-bottom connected by reassigning all the cluster numbers from one
% to match the other.

[I,J,V] = find(abs(IMG));
l = length(I);    % l (lowercase L): number of nonzero pixels

% first pass - assign same row clusters
C(1) = 1;
cmax = 1;
cind = cmax;
for i = 2:l
    new_col = J(i) - J(i-1);
    if new_col == 0
        % same column as the previous pixel: look for a left neighbor
        t = find((I == I(i)) & (J == (J(i) - 1)));
        if isempty(t)
            if I(i) == (I(i-1) + 1)
                C(i) = cind;
            else
                cmax = cmax + 1;
                cind = cmax;
                C(i) = cind;
            end
        else
            cind = C(t);
            C(i) = cind;
        end
    elseif new_col == 1
        % first pixel of the next column: look for a left neighbor
        t = find((I == I(i)) & (J == (J(i) - 1)));
        if isempty(t)
            cmax = cmax + 1;
            cind = cmax;
            C(i) = cind;
        else
            cind = C(t);
            C(i) = cind;
        end
    else
        % one or more empty columns were skipped: start a new cluster
        cmax = cmax + 1;
        cind = cmax;
        C(i) = cind;
    end
end

% second pass - assign same column clusters
for i = 2:l
    if (J(i) == J(i-1)) & (I(i) == I(i-1)+1)
        if C(i) ~= C(i-1)
            t = C(i-1);
            T = find(C == t);
            q = length(T);
            for k = 1:q
                C(T(k)) = C(i);
            end
        end
    end
end

CENT = centroid(IMG);

% determine the number of unique clusters, size, energy & distance
num = 0;
for i = 1:cmax
    T = find(C == i);
    if ~isempty(T)
        num = num + 1;
        s = length(T);
        csize(num) = s;
        e = 0;
        rowsum = 0;
        colsum = 0;
        for k = 1:s
            e = e + V(T(k));
            rowsum = rowsum + V(T(k))*I(T(k));
            colsum = colsum + V(T(k))*J(T(k));
        end
        E(num) = e;
        rowm = rowsum/e;
        colm = colsum/e;
        Mn = [rowm colm];
        M(num,1:2) = Mn;
        D(num) = sqrt((Mn - CENT)*(Mn - CENT)');
    end
end
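
The grouping that the header comment of cluster describes (pixels joined left-right and top-bottom) is 4-connected component labeling. The C sketch below reaches the same kind of grouping with a different technique, a union-find structure, rather than the two-pass relabeling of the listing. The function names and the `parent` array are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Return the representative of x, shortening the path as it goes. */
static size_t uf_find(size_t *parent, size_t x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

/* Merge the sets that contain a and b. */
static void uf_union(size_t *parent, size_t a, size_t b)
{
    size_t ra = uf_find(parent, a);
    size_t rb = uf_find(parent, b);
    parent[ra] = rb;
}

/* Label 4-connected clusters of nonzero pixels in a row-major image.
 * parent needs rows*cols entries. On return, parent[p] is the
 * representative pixel index of the cluster containing pixel p
 * (zero pixels keep their own index). Returns the number of clusters. */
size_t label_clusters(const double *img, size_t rows, size_t cols,
                      size_t *parent)
{
    size_t n = rows * cols, count = 0, r, c, p;

    assert(img != NULL && parent != NULL);
    for (p = 0; p < n; p++)
        parent[p] = p;
    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            p = r * cols + c;
            if (img[p] == 0.0)
                continue;
            if (c > 0 && img[p - 1] != 0.0)       /* left neighbor */
                uf_union(parent, p, p - 1);
            if (r > 0 && img[p - cols] != 0.0)    /* upper neighbor */
                uf_union(parent, p, p - cols);
        }
    }
    for (p = 0; p < n; p++) {
        parent[p] = uf_find(parent, p);           /* flatten every path */
        if (img[p] != 0.0 && parent[p] == p)
            count++;                              /* one root per cluster */
    }
    return count;
}
```

Per-cluster energy, size, and center could then be accumulated in one further pass over the pixels, indexed by `parent[p]`.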


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


%single hidden layer, sigmoid activation function, single output
%neural net
% TRAINING IN BATCH MODE
%[err_c0,err_c1,W1,W2,dzdx]=seltrn(data,HL,maxepochs,lr,clamp,type);
%
%INPUT:
%data: 1st col class, remaining cols features, # of rows = # of samples
%  HL: number of desired hidden nodes
%maxepochs: maximum number of epochs to train
%  lr: learning rate
%  clamp: clamp output > 1-clamp to 1-clamp or < clamp to clamp
%  type: select backprop method: 0 normal, 1 imbalanced
%
%OUTPUT:
%  err_c0: error for class 0 for each epoch
%  err_c1: error for class 1 for each epoch
%  W1: final weights for input to hidden layer
%  W2: final weights for hidden layer to output node
%  dzdx: Ruck saliency of each feature, normalized to a maximum of 1
%
%This program will train a neural net for an imbalanced training set
%with two classes with a selectable number of hidden nodes and a
%single output node.

function [err_c0,err_c1,W1,W2,dzdx]=seltrn(data,HL,maxepochs,lr,clamp,type)

%rand seed value
rand('seed', sum(100*clock));

[n,I]=size(data);
I=I-1;    %number of features (first column holds the class)

%normalize data
ave=mean(data(:,2:I+1));
dev=std(data(:,2:I+1));

average=ones(n,1) * ave;
sigma=ones(n,1) * dev;

data(:,2:I+1)=(data(:,2:I+1)-average)./sigma;
data=data';

%initialize weights in the net
W1=rand(HL,I+1)-0.5;    %[HL by I+1]
W2=rand(1,HL+1)-0.5;    %[1 by HL+1]

err_c0=[];
err_c1=[];
epoch=0;

while epoch<maxepochs

    %Initialize variables
    mse0=[];
    mse1=[];

    index=randperm(n);
    count0=1;
    count1=1;
    z1_c0=[];
    z1_c1=[];
    z2_c0=[];
    z2_c1=[];
    X_c0=[];
    X_c1=[];
    n0=0;
    n1=0;

    for i=1:n;

        %desired output
        d(i)=data(1,index(i));

        %feature vector with bias (I+1 by n)
        X(:,i) = [data(2:I+1,index(i)); 1];

        %compute activation functions

        %hidden layer (HL by n)
        z1(:,i)=1./(1+exp(-W1 * X(:,i)));

        %output layer (1 by n)
        z2(1,i)=1./(1+exp(-W2 * [z1(:,i);1]));

        %clamp output values
        if z2(1,i)>(1-clamp);
            z2(1,i)=1-clamp;
        elseif z2(1,i)<clamp;
            z2(1,i)=clamp;
        else;
            z2(1,i)=z2(1,i);
        end;

        %divide input, hidden and output layer results by class
        if d(i)==1;
            X_c1=[X_c1 X(:,i)];
            z1_c1=[z1_c1 z1(:,i)];
            z2_c1=[z2_c1 z2(1,i)];
            n1=n1+1;
        else;
            X_c0=[X_c0 X(:,i)];
            z1_c0=[z1_c0 z1(:,i)];
            z2_c0=[z2_c0 z2(1,i)];
            n0=n0+1;
        end;

    end;    %all train samples through the net

    %find first derivative of hidden and output layers

    %derivative of hidden layer (HL by n0)
    dz1_c0=z1_c0.*(1-z1_c0);

    %derivative of output layer (1 by n0)
    dz2_c0=z2_c0.*(1-z2_c0);

    %derivative of hidden layer (HL by n1)
    dz1_c1=z1_c1.*(1-z1_c1);

    %derivative of output layer (1 by n1)
    dz2_c1=z2_c1.*(1-z2_c1);

    dout_c0=dz2_c0 .* (clamp-z2_c0);           %(1 by n0)
    temp_c0=W2' * dout_c0;                     %(HL+1 by n0)
    dh1_c0 = dz1_c0 .* temp_c0(1:HL,:);        %(HL by n0)

    dout_c1=dz2_c1 .* (1-clamp-z2_c1);         %(1 by n1)
    temp_c1=W2' * dout_c1;                     %(HL+1 by n1)
    dh1_c1 = dz1_c1 .* temp_c1(1:HL,:);        %(HL by n1)

    %calculate gradients for each class
    GE_W1_c0=dh1_c0 * X_c0';
    GE_W2_c0=dout_c0 * [z1_c0;ones(1,n0)]';
    GE_W1_c1=dh1_c1 * X_c1';
    GE_W2_c1=dout_c1 * [z1_c1;ones(1,n1)]';

    %update the weights
    if type==0;

        %regular backprop GE=GE_c0 + GE_c1
        W1 = W1 + lr*(GE_W1_c0 + GE_W1_c1);    %(HL by I+1)
        W2 = W2 + lr*(GE_W2_c0 + GE_W2_c1);    %(1 by HL+1)

    else;

        %imbalanced training set

        %find unit vectors for each gradient
        unit_GE_W1_c0=GE_W1_c0/sqrt(sum(sum(GE_W1_c0.^2)));
        unit_GE_W1_c1=GE_W1_c1/sqrt(sum(sum(GE_W1_c1.^2)));
        unit_GE_W2_c0=GE_W2_c0/sqrt(sum(GE_W2_c0.^2));
        unit_GE_W2_c1=GE_W2_c1/sqrt(sum(GE_W2_c1.^2));

        %set direction to the bisecting angle between the class GE vectors
        ang_GE_W1=(unit_GE_W1_c0 + unit_GE_W1_c1)/2;
        ang_GE_W2=(unit_GE_W2_c0 + unit_GE_W2_c1)/2;

        %calculate magnitude of GE vectors
        mag_GE_W1=sqrt(sum(sum((GE_W1_c0 + GE_W1_c1).^2)));
        mag_GE_W2=sqrt(sum((GE_W2_c0 + GE_W2_c1).^2));

        %create new GE vectors
        GE_W1=mag_GE_W1*ang_GE_W1;
        GE_W2=mag_GE_W2*ang_GE_W2;

        %update weights with new backprop
        W1=W1+lr*GE_W1;
        W2=W2+lr*GE_W2;

    end;

    %calculate the mse for each class
    for i=1:n
        if d(i)==0;
            mse0(count0)=(clamp-z2(i))^2;
            count0=count0+1;
        else;
            mse1(count1)=(1-clamp-z2(i))^2;
            count1=count1+1;
        end;
    end;

    %compute epoch error for each class
    epoch_err_c0=mean(mse0);
    epoch_err_c1=mean(mse1);
    err_c0=[err_c0 epoch_err_c0];
    err_c1=[err_c1 epoch_err_c1];
    epoch=epoch+1;

end;

% Ruck Feature Saliency

dzdx=zeros(1,I);
for i=1:n

    z1 = 1 ./ (1 + exp(-W1 * X(:,i)));
    z2 = 1 ./ (1 + exp(-W2 * [z1; 1]));
    fprime1 = z1 .* (1-z1);
    fprime2 = z2 .* (1-z2);

    %dzdx contains each feature's saliency for all training samples
    dzdx1=abs((W1(:,1:I)'*(((W2(:,1:HL)'*fprime2).*fprime1)))');
    dzdx=dzdx + dzdx1;

end    % (for i=1:n)

dzdx=dzdx/max(dzdx);
dzdx
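
The difference between the two weight updates in seltrn can be summarized in equation form. The symbols are shorthand introduced here and do not appear in the listing: $G_0$ and $G_1$ stand for the per-class terms `GE_*_c0` and `GE_*_c1`, $\eta$ for `lr`, and $\lVert\cdot\rVert$ for the root of the sum of squared elements, as computed in the code. Because each $G_c$ is built from (target minus output), adding it to the weights reduces the squared error.

```latex
\begin{aligned}
\hat{G}_c &= \frac{G_c}{\lVert G_c \rVert}, \qquad c \in \{0, 1\} \\
\text{type 0:}\quad \Delta W &= \eta \, (G_0 + G_1) \\
\text{type 1:}\quad \Delta W &= \eta \, \lVert G_0 + G_1 \rVert \, \frac{\hat{G}_0 + \hat{G}_1}{2}
\end{aligned}
```

In the type 1 update the direction bisects the two class terms, so each class influences the direction equally regardless of how many samples it contributes, while the scale factor is taken from the ordinary summed term.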
B.2 C Code


This section contains the C code developed to perform the local thresholding. The
program requires a 1024 by 2048 binary image file of unsigned short integers, named
“local_thres_img”, in the current directory. A parameter file named “local_param”,
holding the threshold and the local window size as floating point values, must also be
present. The program tests every pixel in the image to determine whether it is greater
than the mean plus the threshold times the standard deviation of the pixels surrounding
the test pixel. The size of this surrounding region is set by the local window size
parameter. In the listing below, the mean of the surrounding region must also exceed
1200 for the pixel to pass. The program outputs a binary image with ones where the pixel
met the criteria and zeros where it did not. This file is written to disk with the name
“local_mask” and saved as a floating point data type.

This code can be compiled using the following at the command line on a Unix platform:
cc -o output.exe local_thres.c -lm


#include <stdio.h>
#include <math.h>
#include <stdlib.h>

#define max_rows 1024
#define max_cols 2048

float mamo[max_rows][max_cols];
float new_mask[max_rows][max_cols];
unsigned short bufin[max_rows*max_cols];
float bufout[max_rows*max_cols];
char header[1024];

int main()
{
    FILE *ifp,*ofp;
    int nread,nitems=2,count=0,m,k;
    float oldsum,oldsumofsqr,sum,sumofsquares;
    float mean,std,low_t,win_size,param[2],temp;
    int row,col,ws;

    /* Read in Mammogram */
    ifp = fopen("local_thres_img","r");
    if (ifp == NULL) { perror("local_thres_img"); exit(1); }
    nread = fread(bufin, sizeof(unsigned short), max_rows*max_cols, ifp);
    fclose(ifp);
    for (row=0;row<max_rows;row++)
        for (col=0;col<max_cols;col++)
            mamo[row][col] = (float) bufin[row*max_cols+col];

    /* Read in Values for parameters */
    ifp = fopen("local_param","r");
    if (ifp == NULL) { perror("local_param"); exit(1); }
    nread = fread(param, sizeof(float), nitems, ifp);
    fclose(ifp);

    low_t=param[0];
    win_size=param[1];
    ws = (int) win_size;

    /* Fill outer edge of mask with zeros */
    for (row=0;row<((ws-1)/2);row++)
        for (col=0;col<max_cols;col++)
            new_mask[row][col]=0.0;

    for (row=max_rows-((ws-1)/2);row<max_rows;row++)
        for (col=0;col<max_cols;col++)
            new_mask[row][col]=0.0;

    for (row=((ws-1)/2);row<max_rows-((ws-1)/2);row++)
        for (col=0;col<((ws-1)/2);col++)
            new_mask[row][col]=0.0;

    for (row=((ws-1)/2);row<max_rows-((ws-1)/2);row++)
        for (col=max_cols-((ws-1)/2);col<max_cols;col++)
            new_mask[row][col]=0.0;

    /* test first pixel */
    sum = 0.0;
    sumofsquares = 0.0;
    for (row=0; row<ws; row++)
        for (col=0; col<ws; col++)
        {
            sum = sum + mamo[row][col];
            sumofsquares = sumofsquares + mamo[row][col]*mamo[row][col];
        }

    oldsum = sum;
    oldsumofsqr = sumofsquares;
    m = (ws-1)/2;

    mean = sum/(win_size*win_size);
    temp = (sumofsquares-((sum*sum)/(win_size*win_size)))/(win_size*win_size-1);
    if (temp<=1.0)
        std=temp;
    else
        std = (float) sqrt((double) temp);

    if (mamo[m][m]>mean+low_t*std && mean>1200.0)
        new_mask[m][m]=1.0;
    else
        new_mask[m][m]=0.0;

    /* test all other pixels */
    for (row=m; row<max_rows-m; row++)
    {
        /* slide the window one column at a time along the current row */
        for (col=m+1; col<max_cols-m; col++)
        {
            for (k=-m; k<m+1; k++)
            {
                sum = sum - mamo[row+k][col-m-1] + mamo[row+k][col+m];
                sumofsquares = sumofsquares
                    - mamo[row+k][col-m-1]*mamo[row+k][col-m-1]
                    + mamo[row+k][col+m]*mamo[row+k][col+m];
            }

            mean = sum/(win_size*win_size);
            temp = (sumofsquares-((sum*sum)/(win_size*win_size)))/(win_size*win_size-1);
            if (temp<=1.0)
                std=temp;
            else
                std = (float) sqrt((double) temp);

            if (mamo[row][col]>mean+low_t*std && mean>1200.0)
                new_mask[row][col]=1.0;
            else
                new_mask[row][col]=0.0;
        }

        /* Move the first-column window down one row. The guard is added here
           so that the last row does not read past the end of the image. */
        if (row+1 < max_rows-m)
        {
            sum = oldsum;    /* update sum and sumofsquares */
            sumofsquares = oldsumofsqr;
            for (k=-m; k<m+1; k++)
            {
                sum = sum - mamo[row-m][m+k] + mamo[row+m+1][m+k];
                sumofsquares = sumofsquares
                    - mamo[row-m][m+k]*mamo[row-m][m+k]
                    + mamo[row+m+1][m+k]*mamo[row+m+1][m+k];
            }

            oldsum = sum;    /* update oldsum and oldsumofsqr */
            oldsumofsqr = sumofsquares;

            /* calculate statdiff for 1st nonzero */
            /* output pixel in next row */
            mean = sum/(win_size*win_size);
            temp = (sumofsquares-((sum*sum)/(win_size*win_size)))/(win_size*win_size-1);
            if (temp<=1.0)
                std=temp;
            else
                std = (float) sqrt((double) temp);

            /* Indices corrected to (row+1, m), the center of the shifted
               window; the scanned listing tests (row, col) at this point. */
            if (mamo[row+1][m]>mean+low_t*std && mean>1200.0)
                new_mask[row+1][m]=1.0;
            else
                new_mask[row+1][m]=0.0;
        }
    }

    /* Output mask of potential regions */
    for (row=0;row<max_rows;row++)
        for (col=0;col<max_cols;col++)
            bufout[row*max_cols+col]=new_mask[row][col];

    ofp = fopen("local_mask","w");
    if (ofp == NULL) { perror("local_mask"); exit(1); }
    nread = fwrite(bufout, sizeof(float), max_rows*max_cols, ofp);
    fclose(ofp);

    printf("mask completed.\n");
    return 0;
}
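
The program above avoids recomputing each window from scratch by keeping a running sum and a running sum of squares. The one-dimensional sketch below isolates that idea. The function name `sliding_stats` and its signature are assumptions made for illustration. It returns the sample variance, so a local threshold test would compare a pixel against the mean plus the threshold times the square root of this value.

```c
#include <assert.h>
#include <stddef.h>

/* Sample mean and variance of every length-w window of x[0..n-1], using
 * running sums that are updated with one value leaving and one value
 * entering per step. mean[i] and var[i] describe the window x[i..i+w-1];
 * both output arrays need n-w+1 entries. Requires 2 <= w <= n. */
void sliding_stats(const double *x, size_t n, size_t w,
                   double *mean, double *var)
{
    double sum = 0.0, sumsq = 0.0;
    size_t i;

    assert(x != NULL && mean != NULL && var != NULL);
    assert(w >= 2 && w <= n);

    /* full computation for the first window only */
    for (i = 0; i < w; i++) {
        sum += x[i];
        sumsq += x[i] * x[i];
    }
    for (i = 0; ; i++) {
        mean[i] = sum / (double)w;
        var[i] = (sumsq - sum * sum / (double)w) / (double)(w - 1);
        if (i + w >= n)
            break;
        /* slide by one: drop x[i], take in x[i+w] */
        sum += x[i + w] - x[i];
        sumsq += x[i + w] * x[i + w] - x[i] * x[i];
    }
}
```

In two dimensions the same update is applied to a whole column (or row) of the window at a time, which is what the loops over `k` do in the listing.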